WOLFGANG U. DRESSLER – BASILIO CALDERONE – SABINE SOMMER-LOLEI – KATHARINA KORECKY-KRÖLL (EDS.)

# EXPERIMENTAL, ACQUISITIONAL AND CORPUS LINGUISTIC APPROACHES TO THE STUDY OF MORPHONOTACTICS

#### ÖSTERREICHISCHE AKADEMIE DER WISSENSCHAFTEN PHILOSOPHISCH-HISTORISCHE KLASSE SITZUNGSBERICHTE, 915. BAND

#### VERÖFFENTLICHUNGEN ZUR LINGUISTIK UND KOMMUNIKATIONSFORSCHUNG

#### BAND 32

HERAUSGEGEBEN VON WOLFGANG U. DRESSLER

## Experimental, Acquisitional and Corpus linguistic Approaches to the Study of Morphonotactics

edited by

WOLFGANG U. DRESSLER, BASILIO CALDERONE, SABINE SOMMER-LOLEI, KATHARINA KORECKY-KRÖLL

Accepted by the publication committee of the Division of Humanities and Social Sciences of the Austrian Academy of Sciences: Michael Alram, Bert G. Fragner, Andre Gingrich, Hermann Hunger, Sigrid Jalkotzy-Deger, Renate Pillinger, Franz Rainer, Oliver Jens Schmitt, Danuta Shanzer, Peter Wiesinger, Waldemar Zacharasiewicz

> Printed with the support of the Austrian Science Fund (FWF): PUB 862-Z

Open access: Except where otherwise noted, this work is licensed under a Creative Commons Attribution 4.0 Unported License. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/

This publication was subject to international and anonymous peer review. Peer review is an essential part of the Austrian Academy of Sciences Press evaluation process. Before any book can be accepted for publication, it is assessed by international specialists and ultimately must be approved by the Austrian Academy of Sciences Publication Committee.

The paper used in this publication is DIN EN ISO 9706 certified and meets the requirements for permanent archiving of written cultural property.

> Some rights reserved. ISBN 978-3-7001-8714-1 Copyright © Austrian Academy of Sciences, Vienna 2021 Layout: Andrea Sulzgruber, Vienna Print: Prime Rate, Budapest https://epub.oeaw.ac.at/8714-1 https://verlag.oeaw.ac.at Made in Europe

### Introduction

BASILIO CALDERONE <sup>1</sup>, WOLFGANG U. DRESSLER <sup>2</sup>

Language sounds are realized in several different ways. Every language exploits no more than a subset of the sounds that the vocal tract can produce, as well as a reduced number of their possible combinations. The restrictions and the phonemic combinations allowed in the language define a branch of phonology called phonotactics.

Phonotactics refers to the sequential arrangement of phonemic segments in morphemes, syllables, and words (Harris 1955) and underlies a wide range of phonological issues, from acceptability judgments (pseudowords like legal <blick> vs. illegal <bnik> in English or legal <*Pfraus*> vs. illegal <*Xraus*> in German) to syllable processes (the syllabic structure in a given language is based on phonotactic permissions in that language) and the nature and length of possible consonant clusters (which may be seen as intrinsically marked structures with respect to the preferred CV template). This volume deals only with consonant clusters.

The study of phonotactics entails a set of problematic aspects due to its nature. In fact, if, on the one hand, phonotactics is part of the phonological grammar of the language and appears as a rules-based system, on the other, it is controlled by a number of non-categorical, probabilistic and gradient constraints. Often the researcher is faced with a series of apparent contradictions and empirical problems that require critical comparisons of alternative explanatory models and, most often, an investigation of the 'interfaces' and 'intersections' between phonotactics and other levels of linguistic organization, particularly phonetics or, instead, only phonology and morphology.

However, this volume focuses on experimental, acquisitional and corpus linguistic aspects of *morphonotactics*, which represents an intersection area between phonotactics and the morphemic structure of the language.

<sup>1</sup> CNRS, CLLE-ERSS, University of Toulouse (UT2), Toulouse, France.

<sup>2</sup> Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) of the Austrian Academy of Sciences, Vienna & University of Vienna.

In particular, morphonotactics deals with the interplay between the ordering restrictions of morphemes (the so-called *morphotactics*) and the phonemic sequences of consonant clusters (*phonotactics*). More specifically, a consonant cluster is *phonotactic* in the strict sense when it occurs within a morpheme (such as /kt/ in German *nackt* 'naked' or *Akt* 'act', or in English <act> or <detect>, or as /nd/ in German *Kind* 'child' or *Rand* 'edge', or in English <kind> or <sound>). A consonant cluster counts as morphonotactic when it results from a morphological operation such as concatenation (such as /kt/ in German *zuck-te* 'jerked', *tank-te* 'refuelled', and in English <kick-ed>, <thank-ed>). It is important to note that apart from clusters that are purely phonotactic (such as final /mp/ in English <limp>, or final /mpf/ as in German *Dampf* 'steam') or purely morphonotactic (such as final /md/ in English <seem-ed> or final /ŋkst/ in German *lenk-st* (steer-2SG) 'you steer'), many clusters can occur both phonotactically and morphonotactically (e.g. the cluster /kt/ in German as in *Akt* 'act' vs. *zuck-te* 'jerked', or /nd/ in English as in <kind> vs. <sign-ed>). A much less frequent interaction between phonotactics and morphotactics takes place when phonological deletion produces a consonant cluster in inflection or word formation (as in German *Risiko* 'risk', *Risk-en* (risk-PL) 'risks', *risk-ant* (risk-ADJ) 'risky').
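The distinction between the two cluster types can be illustrated with a minimal sketch. It assumes that words are given in a rough, simplified transcription in which a hyphen marks a morpheme boundary, and that a small toy vowel inventory is sufficient. The function name `clusters_with_status` and the set `VOWELS` are hypothetical and serve only this illustration.

```python
# Assumption: a toy vowel inventory; every other symbol counts as a consonant.
VOWELS = set("aeiouɪʊɛɔə")


def clusters_with_status(segmented):
    """Label each consonant cluster of a word as phonotactic or morphonotactic.

    `segmented` is a simplified transcription in which '-' marks a
    morpheme boundary (an assumption of this sketch).
    """
    results = []
    run = ""                  # consonants collected since the last vowel
    boundary_inside = False   # True if '-' falls between two consonants of the run
    pending_boundary = False  # '-' was seen after at least one consonant of the run
    for ch in segmented + "#":  # '#' is an end-of-word sentinel
        if ch == "-":
            pending_boundary = bool(run)
        elif ch in VOWELS or ch == "#":
            if len(run) >= 2:
                status = "morphonotactic" if boundary_inside else "phonotactic"
                results.append((run, status))
            run, boundary_inside, pending_boundary = "", False, False
        else:
            if pending_boundary:
                boundary_inside = True
                pending_boundary = False
            run += ch
    return results


print(clusters_with_status("akt"))     # German Akt
print(clusters_with_status("zuk-tə"))  # German zuck-te
```

A cluster counts as morphonotactic here only if the boundary falls between two of its consonants, so a boundary directly after a vowel does not change the status of the following cluster.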

A main focus of several of the articles collected in this volume is the Strong Morphonotactic Hypothesis (SMH), as proposed by Dressler and Dziubalska-Kołaczyk (2006). The SMH claims that morphonotactic consonant clusters are favoured in processing and acquisition compared to phonotactic clusters. This implies a synergy of morphology and phonology, with the acquisitionist effect that morphonotactic consonant clusters are acquired earlier than corresponding phonotactic clusters. This was first interpreted as an earlier emergence of morphonotactic clusters (i.e. when a consonant cluster is first produced correctly), but Kelić and Dressler (2019) have shown for Croatian that it rather holds for mastery of consonant clusters (i.e. when a cluster continues to be produced correctly). As to processing, the SMH claims that morphonotactic clusters are processed more accurately and rapidly than corresponding phonotactic clusters. These claims have been restricted to languages or morphological components with a rich morphology, such as inflection in Slavic languages, whereas they may not hold for languages and morphological components with less morphological wealth, such as inflection in Germanic languages. See, for more on this, the chapter by Sommer-Lolei et al. in this volume.

The articles represent the result of a joint French-Austrian interdisciplinary project with the title 'Human Behaviour and Machine Simulation in the Processing of (Mor)Phonotactics' funded by the ANR (Agence Nationale de la Recherche, ANR-13-ISH2-0002) and the FWF Austrian Science Fund (Fonds zur Förderung der wissenschaftlichen Forschung, I-1394-G23). The main focus of investigation of the project was the study of the psycho-computational representation of (mor)phonotactics in French and German speakers from two angles simultaneously: *human behaviour* and *machine simulation*. Both of them cover a broad range of activities: from computational simulations (computational models appealing solely to distributional information for the linguistic data and processing the statistical regularities of representative corpora) to longitudinal studies of acquisition (in order to test whether there are systematic differences with regard to the phases at which phonotactic and morphonotactic clusters are acquired), psycholinguistic analyses (aiming at verifying the psychological plausibility of hypotheses on the phonotactics/morphonotactics distinction) and production analysis (focusing on the phonetic repair mechanisms and the systematic differences in the production of morphonotactic and phonotactic clusters in actual speech data). The present volume contains five papers focusing on the acquisition, speech production, processing and corpus linguistic analysis of morphonotactic vs. phonotactic clusters.

The papers combine distributional analysis and experimental investigations based on large corpora or on the analysis of the speakers' behaviour in producing phonotactically marked structures such as consonant clusters.

The first paper of the collection 'German phonotactic vs. morphonotactic obstruent clusters: a corpus linguistic analysis' by **Wolfgang U. Dressler** and **Alona Kononenko-Szoszkiewicz** presents a corpus-based study of the obstruent clusters in German. In particular, the paper investigates the distribution, in terms of type and token frequency, of triple consonant clusters (excluding glides) containing two obstruents. The study is framed within the NAD (Net Auditory Distance) model, a net reflection of the difference between adjacent segments in terms of the manner and place of articulation (Dziubalska-Kołaczyk 2002). One main result discussed by the authors is that, according to NAD predictions, (at least triple) morphonotactic clusters are preferred over phonotactic clusters for German word-final position, which supports the Strong Morphonotactic Hypothesis (SMH, as described above). This must be compared with psycholinguistic evidence, as reported in the chapter by Sommer-Lolei et al. (below). The typological characterization of the German language with regard to the word-final and word-initial obstruent clusters, in contrast to Slavic and other Indo-European languages, is also discussed at the end of the paper.

The paper 'Morphonotactics in speech production' by **Hannah Leykum** and **Sylvia Moosmüller†** investigates the influence of morphology on the phonetic realization of utterances. The authors perform acoustic analyses of word-medial and word-final consonant clusters, which could occur both within a morpheme as well as across morpheme boundaries. The hypothesis underlying the study is that consonant clusters across word-internal morpheme boundaries (morphonotactic clusters) are expected to be more robust and more highlighted in speech production than consonant clusters within a morpheme (phonotactic clusters). The analyses are conducted in three different language types: a word language (Standard German German, SGG), a mixed-type language (Standard Austrian German, SAG) and a quantifying language (Standard French, FR). These three types were chosen to investigate whether language-type-specific timing characteristics have an influence on the highlighting/reduction of consonant clusters. Concerning the language type, the authors hypothesize that differences between phonotactic and morphonotactic clusters are more pronounced in SGG as compared to SAG, and the differences are expected to be greater than those in FR for both varieties of German. The results of the analyses fail to confirm the main hypothesis and reveal that there is no difference in respect of durational and intensity characteristics between clusters with and those without a morpheme boundary. However, as the authors state, the absence of any effects does not necessarily imply that no direct influence of morpheme boundaries on the realization of consonant clusters exists, thus overriding an impact on phonology. Besides language-specific timing characteristics, other language-specific differences could exist. The three investigated languages share a low morphological richness, raising the question of whether the morphological richness of a language determines whether phonotactic and morphonotactic clusters behave the same or not. It is possible that in morphologically richer languages, the information about the morpheme boundary is more important to ensure intelligibility.

The paper 'The acquisition and processing of (mor)phonotactic consonant clusters in German' by **Sabine Sommer-Lolei, Katharina Korecky-Kröll, Markus Christiner** and **Wolfgang U. Dressler** presents a set of psycholinguistic experiments testing the Strong Morphonotactic Hypothesis (SMH) which claims that morphonotactic consonant clusters foster processing and acquisition. The two psycholinguistic processing methods used are progressive demasking and lexical decision. The results partially confirm the SMH, showing a significant positive effect only for rich compounding, a partial trend for less rich derivational morphology and no effect for inflection, which is relatively poor in German and thus cannot facilitate lexical processing. A psycholinguistically important finding is that familiarity often has a greater facilitating effect than frequency. The acquisition part of the chapter presents longitudinal data up to 3 years (3;0) of age and quasi-longitudinal transversal data up to a mean age of 4;8. Since the data does not suffice for making separate statistical analyses for inflection, derivation and compounding, no facilitating effects have been found in previously reported analyses. This contrasts with the facilitation effects found in morphology-rich Lithuanian, Polish and Croatian inflectional and derivational morphology.

The paper 'Exploring phonotactic and morphonotactic constraints in the acquisition of consonant clusters in L1 French' by **Barbara Köpke, Olivier Nocaudie** and **Hélène Giraudo** focuses on the possible effects of age, position and phonotactic vs. morphonotactic status in the successful pronunciation of the different French consonant clusters. The authors analyse distributionally longitudinal CHILDES data from four children (aged 1;6 to 3;0) collected in spontaneous speech interactions between a parent and the child. The analysis shows a high variation of error types (such as reduction, substitution, omission, repetition, epenthesis, shifted cluster or mixed sounds) in the characterization of consonant clusters. A more detailed exploration of the individual developmental trajectories, however, demonstrates the presence of an overall developmental pattern with the number of omissions decreasing while the number of reductions increases within the age groups. Concerning the consonant cluster's position in the word, overall French children have a tendency to a left-side preference in the development of the pronunciation of clusters. Finally, the morphonotactic status of the cluster also seems to have a significant effect on the development of pronunciation, although only in a medial position. According to the authors, this positive effect of the morphonotactic status should be pondered in relation to several factors inherent to the corpus which may modulate and affect the results. In particular, morphonotactic clusters are relatively scarce in French and they never appear in the word-initial position, in contrast to a medial position and especially the word-final position, which seems a less favourable position in early acquisition. This and other considerations led the authors to the conclusion that an extension of the study to later developmental stages in older children, with a consistent vocabulary between the age groups, is needed, in order to weigh in detail the influence of frequency and position effects in the error patterns related to the (mor)phonotactic status of consonant clusters.

The last paper 'The natural perceptual salience of affixes is not incompatible with a central view of morphological processing', by **Hélène Giraudo, Karla Orihuela, Basilio Calderone** and **Barbara Köpke** reports on a set of behavioural experiments testing the reactions of French adults in a letter search task. The authors discuss the issue of morphonotactic processing through the notion of morphological salience – the functional and perceptual relative prominence of the whole word and its morphological components – and its implications for theories and models of morphological processing. With regard to the SMH, the task was carried out using words that include the target letter after a morphonotactic boundary (e.g. *vivre* 'to live' which contains *viv*- as a morphological base (stem) and -*re* as a suffix, a marker of an infinitive) against those with a purely phonotactic one (e.g. *centre* 'centre' in which -*re* is not a suffix and *cent*- is not a stem). The main hypothesis is that morphonotactic segmentation should be facilitated due to a double salience conveyed in the boundaries, as it is not only phonological but also morphological. The effects of position, initial vs. final, are also explored. The final results show that prototypical morphonotactic sequences are processed faster than phonotactic sequences in a final position, suggesting that phonotactics helps to decompose words into morphemes by enhancing their morphological salience.

Taken together, the papers offer an interdisciplinary view of (mor)phonotactics, as they provide acoustic-phonetic, psycholinguistic and corpus-based evidence in support of the proposed theoretical claims about the nature of phonotactic and morphonotactic structures.

Another merit of the present volume is its crosslinguistic methodology, including two phonologically and morphologically relatively distant languages such as French and German.

Comparing phonotactically very different Germanic and Romance languages in the analysis provides a larger and more informative picture of (mor)phonotactics in these two languages.

Behavioural analyses are of particular relevance for the development of crosslinguistically valid generalizations on (mor)phonotactic processing. Psycholinguistic tests applied to the two languages may help to define a continuum of phonotactic and morphonotactic complexity, with respect to which the two languages will occupy partly different and partly overlapping positions. Similarly, the crosslinguistic differences which emerge from the analysis of speech production contribute to the definition of the continuum of (mor)phonotactic complexity.

We hope that this special issue will provide inspiring suggestions for further investigations, including interdisciplinary approaches, within the domain of the acquisitional, cognitive and physical aspects of sound organization in languages, thus contributing to our knowledge of how human speech structures are acquired, mentally organized and physically produced.

The volume is dedicated to the memory of our colleague Sylvia Moosmüller (1954–2018) who died shortly after finishing her part of the joint contribution with Hannah Leykum.

#### REFERENCES

Dressler, Wolfgang U. & Dziubalska-Kołaczyk, Katarzyna (2006) Proposing morphonotactics, *Italian Journal of Linguistics* 18, 249–266.

Dziubalska-Kołaczyk, Katarzyna (2002) *Beats-and-Binding Phonology*. Frankfurt: Lang.

Harris, Zellig S. (1955) From phoneme to morpheme, *Language* 31(2), 190–222.

Kelić, Maja & Dressler, Wolfgang U. (2019) The development of morphonotactic and phonotactic word-initial consonant clusters in Croatian first-language acquisition, *Suvremena Lingvistika* 45(2), 179–200.

### I. German phonotactic vs. morphonotactic obstruent clusters: a corpus linguistic analysis

WOLFGANG U. DRESSLER <sup>1,2</sup>, ALONA KONONENKO-SZOSZKIEWICZ <sup>1</sup>

#### 1. INTRODUCTION

#### 1.1. AIMS

In this contribution we provide for the first time a typological characterology (in the sense of Mathesius 1928; Lang & Zifonun 1996) of the morphonotactics vs. phonotactics of a single language, compared to contrastive studies such as Dressler et al. (2015) on German vs. Slovak and Zydorowicz et al. (2016) on Polish vs. English. We focus on word-initial and word-final positions (cf. section 4) and on triple consonant clusters (excluding glides) containing two obstruents, because these are more typical for German than for many other languages. We approach them in terms of an interaction between Natural Phonology and Natural Morphology and the Beats-and-Binding phonotactics of Dziubalska-Kołaczyk (2009). We limit our investigation to standard vocabulary and exclude onomastics, because it contains clusters that do not occur in standard vocabulary, such as *gm-* in many place names (*Gmünd, Gmunden* etc.).

 <sup>1</sup> Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) of the Austrian Academy of Sciences, Vienna.

 <sup>2</sup> University of Vienna.

With regard to phonological typology, German, like other Germanic languages, is a rather consonantal language in respect of the relative amount of its consonantal inventory and its variety and complexity of consonant clusters (cf. Maddieson 2006, 2013; Donohue et al. 2013), although – in contrast to several Slavic languages, for example – German has syllabic sonorants only in an unstressed position in casual speech. German has several voiceless affricates, among the typologically rather rare ones the labial-labiodental /p͡f/ (Luschützky 1992). German is richer in consonant clusters word-finally than word-initially, in contrast to most Romance and many other non-Germanic Indo-European languages. Phonological typology, though discussed at least since Trubetzkoy (1939), has focused on the characteristics of phonemes, phoneme oppositions and phoneme inventories. If phonotactics has been treated at all, then it is in terms of syllable structures. Even the recent publications of Hyman (2007), Blevins (2007) and Hyman and Plank (2018) mention consonant clusters at most in passing and never discuss triple or quadruple clusters (for contrastive studies of German, see section 1.6). This lacuna may be due to phonological typologists not working with large electronic corpora, which we do for German in this contribution.

In continuation of previous theoretical and contrastive work (Dressler & Dziubalska-Kołaczyk 2006; Dressler, Dziubalska-Kołaczyk & Pestal 2010; Korecky-Kröll et al. 2014) we are going to characterize German patterns of consonantal morphonotactics vs. phonotactics from a phonological, morphological, typological and corpus linguistic perspective.

We investigate prototypical rather than non-prototypical cases of morphonotactics, i.e. the prototypical case of merely concatenative shapes of morpheme combinations, particularly when they differ from the phonotactics of lexical roots and morphemes and thus signal morpheme boundaries, as in English *seem-ed* /si:m-d/ (i.e. there is no lexical final [-md] cluster in English). The non-prototypical case of morphological combinations resulting in vowel deletion is marginal in German, e.g. in *Risiko* 'risk', adj. *risk-ant* 'risky' (in contrast to the regular case of schwa deletion, more in section 4).

#### 1.2. PHONOTACTICS VS. MORPHONOTACTICS

Morphonotactic clusters differ from phonotactic ones through the interaction of morphotactics with phonotactics (Dressler & Dziubalska-Kołaczyk 2006; Calderone, Celata & Laks 2014; Zydorowicz et al. 2016). More specifically, morphonotactic clusters are either due to the addition of a further morpheme, an affix in the case of derivational morphology or another lexical morpheme in the case of compounding, or due to a subtractive morphotactic operation which leads to vowel deletion, as in Ger. *silbr-ig* 'silvery' from *Silber* 'silver' (more in section 4.2).

Because of this interaction between morphology and phonology, it has been claimed (Dressler & Dziubalska-Kołaczyk 2006: 19–20) that in general morphonotactic clusters are less preferred than phonotactic ones. This contrasts with the Strong Morphonotactic Hypothesis (Dressler & Dziubalska-Kołaczyk 2006; Dressler et al. 2010), which states that in processing and first language acquisition the interaction of morphology with phonotactics facilitates both processing and acquisition. A further claim on the interaction between morphology and phonology has been made by Shosted (2006), who has found a (statistically insignificant) trend of a positive correlation between complexity in the syllable structure and morphological complexity. It would be worth separating phonological and morphonotactic clusters, because only complex morphonotactics should correlate with morphological complexity.

In order to define the level of deviation of morphonotactic (i.e. morphologically and phonologically motivated) consonant clusters from purely phonotactic (i.e. merely phonologically motivated) ones in German, we have applied the gradual scale proposed by Dressler and Dziubalska-Kołaczyk (2006). These are clusters such as the following English ones:

1) Clusters which are always morphologically motivated, i.e. never occur in monomorphemic words (cf. Dressler 1985: 220 f.). To this group belongs a consonant cluster /-md/ which always occurs in past participles due to concatenation of a sonorant with the suffix, as in *seem-ed, claimed.* Other examples of this group are the word-final consonant clusters /-fs, -vz/ as in *laughs, loves, wife's, wives*, which occur only in plurals, third person singular present forms and in Saxon genitives.

2) Clusters which are morphologically motivated as a strong default, i.e. which are paralleled by very few exceptions of a morphologically unmotivated nature. For instance, the cluster /ts/ in most cases occurs across morpheme boundaries, as in *lets*, *meets,* but also in morphologically simple words as in *quartz*, *hertz*. Moreover, in English a strong default is present in a cluster /-ps/ as in *steps*, *keeps*, except the borrowings from Latin such as *apse*, *lapse*, and *glimpse*.

3) Clusters which are morphologically motivated as a weak default, i.e. which are paralleled by more exceptions of a morphologically unmotivated nature. An example is the consonant cluster /-ks/, which is morphonotactic in third person singular verb forms and in plurals, as in *speaks*, *oaks*, but phonotactic where it corresponds to the spelling <x>, as in *fox, mix.*

4) Clusters of which only a minority is morphologically motivated, i.e. which are quite normal phonotactic clusters and may also have some morphological motivation. To this group belongs the cluster /-nd/ that occurs across morpheme boundaries in past-tense verbs or past participles as in *grinned, tanned*. Moreover, as a phonotactic cluster, it is present in a number of words such as *hand*, *land*, *around*.

5) Clusters which are only phonotactic, thus never divided by a morpheme boundary, such as /rf, sk/, as in *turf, ask*.
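The five degrees of the scale can be approximated by a small sketch that assigns a degree from the number of word types in which a cluster is morphonotactic and the number in which it is phonotactic. The scale itself is not defined numerically in the text, so the cut-off `strong_default` (here 0.9) and the function name `scale_degree` are purely illustrative assumptions.

```python
def scale_degree(morphonotactic_types, phonotactic_types, strong_default=0.9):
    """Map type counts of one cluster onto degrees 1-5 of the gradual scale.

    Assumption: the boundary between a strong default (2) and a weak
    default (3) is a share of `strong_default`; the source gives no figure.
    """
    total = morphonotactic_types + phonotactic_types
    if total == 0:
        raise ValueError("cluster is not attested")
    if phonotactic_types == 0:
        return 1  # always morphologically motivated
    if morphonotactic_types == 0:
        return 5  # only phonotactic
    share = morphonotactic_types / total
    if share >= strong_default:
        return 2  # strong default
    if share > 0.5:
        return 3  # weak default
    return 4      # only a minority is morphologically motivated


# Hypothetical counts, not corpus figures:
print(scale_degree(95, 5))
print(scale_degree(20, 80))
```

Other operationalizations, for instance based on token rather than type counts, would be equally conceivable.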

The theoretical background of our contribution is Natural Phonology and Morphology (cf. Dressler 1984; Dziubalska-Kołaczyk & Weckwerth 2002; Dziubalska-Kołaczyk 2009; Kilani-Schoch & Dressler 2005; Dressler & Kilani-Schoch 2017), as well as morphonology (Dressler 1985, 1996a,b), of which morphonotactics is a part (Dressler & Dziubalska-Kołaczyk 2006). This approach not only strives towards descriptive and explanatory adequacy but also towards guaranteeing, at least partially, the psychological reality of the linguistic constructs. This demands a psycholinguistic perspective (cf. Korecky-Kröll et al. 2014 and Sommer-Lolei et al. this volume). In usage-based linguistic and psycholinguistic approaches (Bybee 2001; Bauer 2001; Tomasello 2003), it is often claimed that token frequency is important only for the question of storage (which is not an issue here), whereas only type frequency and the discrepancy between high type frequency and low token frequency are relevant for the productivity and profitability of patterns (cf. Du & Zhang 2010; Berg 2014). Here we compare type and token frequencies, in order to evaluate these claims with fresh data.
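The difference between type and token frequency can be made concrete with a minimal sketch. It assumes that each corpus entry is a triple of word form, cluster and number of occurrences; the function name `type_and_token_frequency` and all counts in the example are hypothetical.

```python
from collections import defaultdict


def type_and_token_frequency(entries):
    """Return {cluster: (type_frequency, token_frequency)}.

    Type frequency counts the distinct word forms containing the cluster;
    token frequency sums the occurrences of those word forms.
    """
    word_types = defaultdict(set)
    token_counts = defaultdict(int)
    for word, cluster, count in entries:
        word_types[cluster].add(word)
        token_counts[cluster] += count
    return {c: (len(word_types[c]), token_counts[c]) for c in word_types}


# Invented counts for illustration only:
entries = [
    ("Akt", "kt", 120),
    ("nackt", "kt", 80),
    ("Akt", "kt", 30),   # the same word form from a second sub-corpus
    ("Kind", "nd", 500),
]
print(type_and_token_frequency(entries))
```

In this toy data /kt/ has the higher type frequency and /nd/ the higher token frequency, which is the kind of discrepancy the comparison is meant to capture.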

#### 1.3. BEATS-AND-BINDING MODEL OF PHONOTACTICS

We investigate consonant clusters in the framework of the Beats-and-Binding phonotactic model established by Dziubalska-Kołaczyk (2002, 2009) which is embedded in Natural Linguistics (Dziubalska-Kołaczyk & Weckwerth 2002) and specifically in Natural Phonology. It is a syllable-less model, which explains the organization of consonant clusters in a language where beats constitute vowels (or the marked option of syllabic sonorants) and consonants are typically non-beats. A core of the Beats-and-Binding model is the Net Auditory Distance (NAD) Principle, which started as a modification of the Sonority Hierarchy principle (Whitney 1865; Sievers 1876; Jespersen 1904; Ohala 1990), called the Optimal Sonority Distance Principle (Dziubalska-Kołaczyk 2002: 82). The present NAD model offers the broadest existing possibility for defining degrees of intersegmental cohesion (Bertinetto et al. 2006) in terms of binding between the beat and adjacent non-beats and between adjacent non-beats, including the preferredness of a cluster.

NAD stands for the measure of auditory distances between neighbouring phonemes and allows construction of the hierarchy of preferences from the most to the least preferred cluster. A preference is understood as basically a universal preference which can be derived from more basic principles (Dressler 1999). A cluster is preferred if it satisfies a pattern of phonetic distances in terms of the place and manner of articulation plus the sonority between clusters specified by the universal preference relevant for their initial, medial or final position in the word (cf. Dziubalska-Kołaczyk 2009, 2014).

It is generally assumed that consonantal languages have more dispreferred consonant clusters than vocalic languages. In order to operationalize this assumption and to determine the status of consonant clusters in German, a software package, namely the Phonotactic Calculator developed by Dziubalska-Kołaczyk, Pietrala and Aperliński (2014) based on earlier work by Grzegorz Krynicki, can be applied. The default parameter values of the calculator include the manner of articulation (MOA), the place of articulation (POA), and a hierarchy of S/O (sonorant/obstruent) distinctions. Due to the Phonotactic Calculator's settings, the maximum number of consonant sequences to be analysed is bounded by triple clusters. Therefore, the current analysis of cluster preferredness in German is demonstrated based on triple consonant clusters.

Let us present the general predictions for a triple consonant cluster C1C2C3V, first for the word-initial position:

$$\mathrm{NAD}(C_1, C_2) < \mathrm{NAD}(C_2, C_3) \geq \mathrm{NAD}(C_3, V)$$

It reads: "For word-initial triple clusters, the NAD between the third consonant and the second consonant should be greater than or equal to the NAD between this third consonant and the vowel, and greater than the NAD between the second and the first consonant" (Dziubalska-Kołaczyk 2014: 5, also for the following citations).

For the word-final position VC1C2C3 it states:

$$\mathrm{NAD}(V, C_1) \leq \mathrm{NAD}(C_1, C_2) > \mathrm{NAD}(C_2, C_3)$$

The condition reads: "For word-final triple clusters, the NAD between the first consonant and the second consonant should be greater than or equal to the NAD between this first consonant and the beat, and greater than the NAD between the second and the third consonant."

The condition for medial triple clusters V1C1C2C3V2 states:

$$\mathrm{NAD}(V_1, C_1) \geq \mathrm{NAD}(C_1, C_2) \;\&\; \mathrm{NAD}(C_2, C_3) < \mathrm{NAD}(C_3, V_2)$$

It reads: "For word-medial triple clusters, the NAD between the first and the second consonant should be less than or equal to the NAD between the first consonant and the beat to which it is bound, whereas the NAD between the second and the third consonant should be less than between the third consonant and the beat to which it is bound."
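The three positional conditions quoted above can be written as simple predicates over pairwise distances. The following is only a sketch: it takes the NAD values as given numbers and does not reproduce how the Phonotactic Calculator derives them from MOA, POA and S/O values. The function names and the example values are hypothetical.

```python
def initial_triple_preferred(nad_c1c2, nad_c2c3, nad_c3v):
    """Word-initial C1C2C3V: NAD(C1,C2) < NAD(C2,C3) >= NAD(C3,V)."""
    return nad_c1c2 < nad_c2c3 and nad_c2c3 >= nad_c3v


def final_triple_preferred(nad_vc1, nad_c1c2, nad_c2c3):
    """Word-final VC1C2C3: NAD(V,C1) <= NAD(C1,C2) > NAD(C2,C3)."""
    return nad_vc1 <= nad_c1c2 and nad_c1c2 > nad_c2c3


def medial_triple_preferred(nad_v1c1, nad_c1c2, nad_c2c3, nad_c3v2):
    """Word-medial V1C1C2C3V2: NAD(V1,C1) >= NAD(C1,C2) and NAD(C2,C3) < NAD(C3,V2)."""
    return nad_v1c1 >= nad_c1c2 and nad_c2c3 < nad_c3v2


# Invented distances, not calculator output:
print(initial_triple_preferred(1.0, 3.0, 2.5))
print(final_triple_preferred(2.0, 1.5, 1.0))
```

A cluster that fails its positional predicate counts as dispreferred in that position under these conditions.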

The NAD product indicates a mean number of all the distances between the neighbouring phonemes in the cluster. It was introduced to the calculator in order to assign a preferability index which is "a number denoting a degree to which a given preference is observed" (Dziubalska-Kołaczyk 2019). The formula for word-initial consonant clusters is as follows:

$$\text{NAD product} = \mathrm{NAD}(C_1, C_2) - \mathrm{NAD}(C_2, V)$$

Thus, it allows the clusters to be ordered according to their degree of preferability values from the most preferred to the least.

#### 1.4. PRINCIPLES OF NATURAL MORPHOLOGY RELEVANT FOR MORPHONOTACTICS

Natural Morphology is a theory of preferences (Dressler 1999; Dressler & Kilani-Schoch 2017) divided into three subtheories. Of the first one, which accounts for universal preferences, the most relevant for morphonotactics are the parameters of iconicity (especially constructional diagrammaticity) and transparency. In connection with the subparameter of constructional diagrammaticity, German morphonotactic consonant clusters are nearly always due to affixation, which is the most iconic operation, whereas anti-iconic subtraction, as in *risk-ant* 'risky', derived from *Risiko* 'risk', is very rare (more in section 3). High transparency favours morphological decomposition, which is undertaken automatically in processing: also from this perspective, affixation facilitates decomposition more than word-internal modification and subtraction, and when a consonant cluster is only morphonotactic, the morpheme boundary is more salient, which facilitates decomposition or segmentation (cf. Korecky-Kröll et al. 2014). Also, high morphosemantic transparency facilitates decomposition, whereas opacity hinders it (Libben 1998; Gagné 2009: 264–268; Hongbo, Gagné & Spalding 2011; Dressler, Ketrez & Kilani-Schoch 2017). For example, the relationship between Ger. *Kun-st* 'art' and its verb base *könn-en* 'be able, can' is both morphotactically and morphosemantically obscure (cf. below and section 2.2).

Within the second subtheory, typological adequacy, German can be characterized as a weakly inflecting language, whose morphology is moderately rich (except in compounding). Thus, compounding may create more morphonotactic clusters than inflection or derivation. Unfortunately, we cannot systematically investigate word-internal clusters due to compounding, because for our corpus there is a lack of corpus linguistic tools for doing this semi-automatically. German is also a more suffixing than prefixing language. That inflectional prefixation cannot create consonantal clusters corresponds to the type of suffixing language to which German belongs.

Within the third subtheory of system adequacy, the criterion of productivity (Bauer 2001; Dressler, Libben & Korecky-Kröll 2014) is very relevant: productive morphological rules, such as plural formation, inflection for person and past participle formation, are liable to be involved in many more morphonotactic consonant clusters than unproductive rules, such as deverbal action/result noun formation, as in *Dien-st* 'service' and *Kun-st* (see above). The endpoint of non-productivity is reached in the case of fossil morphemes, such as the prefix in *Aber-glaube* 'superstition', where the base *Glaube* 'faith' is easy to detect. Still, we can classify its internal triple consonant cluster /rgl/ as morphonotactic.

Although, from a semiotic point of view, morphology is more important than phonology for morphonotactics (Dressler 1985, 1996a), diachronic change may transform morphonotactic clusters into phonotactic clusters, but not vice versa (cf. Dressler et al. 2019).

#### 1.5. DATABASE

The corpus linguistic research was based on the data extracted from the Austrian Media Corpus (AMC), which was developed at the Austrian Academy of Sciences (cf. Ransmayr, Mörth & Matej 2017). It is considered to be one of the largest corpus collections of the German language. It covers all printed resources from Austrian printed media for the last two decades, including the transcripts of Austrian television and broadcast news plus the news reports of the Austria Press Agency APA. This corpus contains about 40 million texts of various genres containing about 10 billion word tokens. It is linguistically annotated with morphosyntactic information and lemmatized. Due to its functionality, a list of all word types and word tokens containing the specific clusters in a given corpus can be selected along with the frequency of occurrence and part of speech. Clearly, the numbers of types (inflectional word forms) given in the lists below refer to what is attested in the AMC; the number of potential correct forms is higher.

The starting point of the research was obtaining the data from the AMC. The corpus automatically allows identification of the position of a cluster, thus different queries were specified in the research. For instance, for the word-initial position the following query was used: "str.". It reads: word-initial triple cluster /str-/ followed by one or more characters. Thus, all consonant clusters along with their frequency of occurrence in the corpus were retrieved, according to their position in the word, for further analysis. The next stage included the elimination of all irrelevant words, such as proper names, misspellings or non-words. The last stage of the analysis was the division of the words into three groups depending on whether the cluster is only morphonotactic, only phonotactic or both.
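The positional retrieval step can be sketched outside the corpus interface with ordinary regular expressions. The snippet below is a hedged illustration only: it does not reproduce the AMC query language, the function name and the toy frequency list are hypothetical, and orthographic strings stand in for the clusters.

```python
import re


def find_by_cluster(freqs, cluster, position):
    """Return the word types that carry `cluster` at the given word edge.

    freqs    -- dict mapping word type -> token frequency (toy data here)
    position -- "initial" or "final"; at least one further character must
                follow (initial) or precede (final) the cluster
    """
    core = re.escape(cluster)
    if position == "initial":
        pattern = core + r".+"
    elif position == "final":
        pattern = r".+" + core
    else:
        raise ValueError("position must be 'initial' or 'final'")
    regex = re.compile(pattern, re.IGNORECASE)
    return {word: n for word, n in freqs.items() if regex.fullmatch(word)}


# Invented frequencies, for illustration only.
toy = {"Straße": 120, "streng": 45, "Stern": 80, "abstrakt": 10,
       "Herbst": 300, "Herbstes": 12}
print(find_by_cluster(toy, "str", "initial"))
print(find_by_cluster(toy, "rbst", "final"))
```

Proper names, misspellings and non-words would still have to be removed by hand or with further resources, as described above.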

The second analysis related to measuring auditory distances in the cluster via the NAD calculator, which was introduced in the previous section. All examples are written in the national German orthography. In the German consonantal system, the phoneme written <ch> is a voiceless palatal or velar fricative; <sch> (and word-initial <s> before a stop) is a voiceless sibilant. For the NAD calculator /r/ is specified as a uvular liquid approximant.

All clusters will be presented according to their position and each cluster will be exemplified by a single word, selected according to its high token frequency. Word types that occurred fewer than five times in the corpus were eliminated from the analysis because most of them were orthographic mistakes or non-words (especially names).
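The frequency threshold and the choice of an exemplar word can be sketched as follows. The helper names and the toy counts are hypothetical; only the cut-off of five occurrences and the selection by highest token frequency come from the description above.

```python
MIN_TOKENS = 5  # word types attested fewer than five times are discarded


def drop_rare_types(freqs, min_tokens=MIN_TOKENS):
    """Keep only word types with at least `min_tokens` occurrences."""
    return {word: n for word, n in freqs.items() if n >= min_tokens}


def exemplar(freqs):
    """Pick the word type with the highest token frequency."""
    return max(freqs, key=freqs.get)


# Invented counts for one cluster, for illustration only.
toy = {"merkst": 410, "borgst": 37, "merkzt": 2}
kept = drop_rare_types(toy)
print(kept)
print(exemplar(kept))
```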

#### 1.6. GERMAN PHONOTACTICS

The phonotactics of German consonant clusters has been described several times. Meinhold and Stock (1980: 180–188) include in their description differences between positions and observe the influence of morphology and of phonostylistics. Hirsch-Wierzbicka (1971) aims to present an exhaustive overview of consonant clusters, but limited to monosyllables. Thus, several word-initial and word-final triple and quadruple consonant clusters are missing (to some extent also for monosyllabic words). There are also incorrect statements about disallowed peripheral clusters. A classical generative account can be found in Heidolph, Flämig and Motsch (1981: 977–990) with the concept of the phonological structure conditions of morphemes (formatives) vs. words.

Szczepaniak (2010: 107) and Fehringer (2011: 97) found specific, but very limited corpus-based evidence that German seems to avoid long word-final morphonotactic consonant groups, insofar as a rising number of consonants correlates with a rising preference for the masculine and neuter genitive allomorph *-es* instead of the allomorph *-s*. This presupposes a continuum for cluster complexity, whereas Wiese (1988, 1991, 2000; cf. Orzechowska & Wiese 2011, 2015) makes a sharp distinction between marked extrametrical consonants (the third and fourth most peripheral consonant of a cluster) and the other consonants of a cluster (more in sections 2.5 and 4.2); loan words are considered to have more extrasyllabic consonants, i.e. more complex consonant clusters (cf. also section 3).

#### 2. WORD-FINAL POSITION

In contrast to most Slavic and Romance and more conservative Indo-European languages, Germanic languages are rather rich in word-final consonant clusters, of both a phonotactic and a morphonotactic nature. Moreover, word-final clusters are more complex, more numerous and more varied in types than word-initial ones.

The morphonotactic clusters in the final position are mainly represented by 2nd SG. and 3rd SG. verb forms, superlatives or past participles, as shown in Dressler and Dziubalska-Kołaczyk (2006; cf. Dressler et al. 2010). They end with the suffixes *-st* (2nd SG., superlative, plus the unproductive deverbal noun-forming suffix) and *-t* (3rd SG., past participle and denominal circumfixes derived from the past participle, ordinal-number-forming suffix).

#### 2.1. QUADRUPLE CLUSTERS

All word-final quadruple clusters consist of a sonorant and 3 obstruents, the two last being always /st/. All are either only morphonotactic or morphonotactic by default.

The following 20 clusters are only morphonotactic (always 2nd SG., sometimes also 3rd SG. or past participle):

/-lkst/ (5): *melk-st* '(you) milk', *ver-folg-st* '(you) persecute',

/-rkst/ (30): *merk-st* '(you) notice', *borg-st* '(you) borrow', past participle *ver-kork-st* 'messed up'. The only phonotactic case occurs in the noun *Gwirkst* that exists only in Austrian dialects and means 'tricky affair': this does not count for the standard.

/-mpst/ (11): *pump-st* '(you) pump', *plumps-t* '(s/he) flops' = *plumpsst* '(you) flop' (with obligatory degemination of /ss/),

/-mp͡fst/ (10): *kämpf-st* '(you) fight',

/-nʃst/ (3): *wünsch-st* '(you) wish',

/-nt͡ʃst/ (3): *plantsch-st* '(you) splash', recent English loan words *launch-st*, *lunch-st*. In oral speech, the /s/ is most often reduced after /ʃ, t͡ʃ/ when followed by /t/.

/-lfst/ (3): *hilf-st* '(you) help',

/-rfst/ (65): *darf-st* '(you) may', *nerv-st* '(you) enervate',

/-rmst/ (29): *form-st* '(you) form',

/-lmst/ (8): *film-st* '(you) film',

/-lxst/ (2): *strolch-st* '(you) roam about',

/-rxst/ (11): *schnarch-st* '(you) snore',

/-ft͡sst/ (2): *seufz-st* '(you) sigh': normally the /s/ is fused with the preceding affricate,

/-xt͡sst/ (3): *ächz-st* '(you) groan' (same fusion),

/-rt͡sst/ (2): *stürz-st* '(you) fall' (same fusion),

/-lʃst/ (2): *fälsch-st* '(you) falsify', *feilsch-st* '(you) haggle',

/-lt͡sst/ (1): *salz-st* '(you) salt' (same fusion; 4 others potential, but not attested).

The following clusters are Gen.SG. of isolated masculine and neuter nouns:

/-ŋkst͡s/ (1): *Hengst-s* 'stallion' (masc.),

/-rpst͡s/ (1): *Herbst-s* 'autumn' (masc.), plus its numerous compounds,

/-lpst͡s/ (1): *Selbst-s* 'the self' (neuter),

/-rnst͡s/ (1): *Ernst-s* 'earnestness' (masc.), plus its numerous compounds.

The four following quadruple clusters are morphonotactic only as a strong default:

/-ŋkst/ as in *denk-st* '(you) think' and in a variant pronunciation of *-ngst*, as in *sing-st* '(you) sing', superlatives *jüng-st* 'recently', the morphosemantically somewhat opaque adverb *läng-st* 'for a long time' (closely related to the transparent superlative *der/die/das läng-st-e* 'the longest'). However, there are two phonotactic exceptions: the nouns *Angst* 'fear' and *Hengst* 'stallion'.

/-rpst/ occurs as a morphonotactic cluster in 2nd SG. verb forms in *stirb-st* '(you) die', *wirb-st* '(you) advertise' (and their preterits). The only phonotactic exception is *Herbst* 'autumn' and compounds thereof (with diachronic loss of a schwa, cognate with Engl. harvest).

/-lpst/ is only morphonotactic in *stülp-st* '(you) turn up (the collar)' and *rülps-t* '(s)he burps' = 2nd SG., Part. *ge-rülps-t*. The transitional exception is *selb-st* 'oneself' with a fossil suffix, related to *der/die/das-selb-e* 'the same'.

/-rnst/ occurs as a morphonotactic cluster in 2nd SG forms, as in *lernst* '(you) learn', and as phonotactic only in the adj. *ernst* 'earnest' and its conversion into a noun.

Table 1 presents for each cluster the number of word types, its token frequency in the corpus and the type-token ratio. Since the NAD calculator is not able to measure all the distances within the quadruple clusters, no preferences can be deduced, but we chose the type-token ratio (TTR) calculation in order to arrive at some generalizations about the morphonotactic vs. phonotactic distribution of these clusters:


Table 1. Distribution of word-final quadruples

The type-token ratio is the most commonly used index of lexical diversity of a text, i.e. the number of word types divided by the number of tokens (McEnery & Hardie 2012), which allows us to analyse the lexical variation of vocabulary containing a specific cluster in the corpus.
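A minimal sketch of the TTR computation follows. It takes the ratio as word types divided by tokens, expressed as a percentage, which is the reading consistent with the values reported in this section (for instance, 100% for a cluster whose word types each occur exactly once). The function name and the counts in the usage lines are hypothetical and do not reproduce Table 1.

```python
def type_token_ratio(n_types, n_tokens):
    """Type-token ratio in percent: word types divided by tokens."""
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 100.0 * n_types / n_tokens


# Invented counts, for illustration only.
print(type_token_ratio(1, 1))            # one type attested once: 100.0
print(round(type_token_ratio(2, 9), 2))  # two types, nine tokens: 22.22
print(type_token_ratio(30, 60000))       # few types, many tokens: 0.05
```

A low TTR therefore signals a small set of word types that recur very often, as with frequent nouns and their compounds.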

It can be observed that: 1) the overall number of tokens increases along with the number of word types; 2) the growth of tokens is exponential. Thus, relying on the data from the AMC, it can be concluded that for word-final quadruple clusters the number of occurrences is in direct relation to the type frequency. Although there are also some other exceptions, there is a group of clusters /-lkst, -nʃst, -lpst/ which consist of a sonorant followed by an obstruent plus /st/. They are relatively rare in types, nevertheless they have a high token frequency in the corpus.

Based on the TTR, the groups of word-final quadruple clusters can be clearly distinguished according to 3 intervals: 1) 14 with a TTR between 0.02 and 1.29%; 2) 3 with a TTR between 10.38 and 11.38%; 3) for 4 clusters the TTR is exactly 100%. In addition, there are 2 with a TTR of 22.22%, 1 at 4.35% and 1 with a TTR of 50%. The TTR in /-rpst/ is the lowest, which means that there are very few words of very high frequency, e.g. *Herbst* 'autumn' is the most frequent word with the final cluster /-rpst/ in the corpus, the frequency of occurrences being due to a great number of compounds ending in *Herbst*. The second group consists of /-rkst, -rmst, -mp͡fst/, again due to the fact that there are rather few words that occur frequently. Finally, the TTR reaches 100% in the third group, where two words have just one form and two others two forms in the corpus. All clusters which are morphonotactic only as a strong default are in the first, the largest group.

The highest type and token frequency of /-rpst/ is due to the richness and productivity of German compounding which leads to the high occurrence of morphonotactic clusters in compounds with the final element *Herbst* 'autumn'. Thus, the TTR is by far the lowest of all the quadruple clusters. The next lowest TTR occurs in /-ŋkst/ which is the only quadruple cluster that includes a phonotactic cluster, i.e. in *Hengst* 'stallion' and its numerous compounds. Something similar to compounding takes place in productive particle word formation. But this pattern generates final verb clusters only in secondary clauses such as *Wenn du den Schal um-häng-st* 'if you put the scarf around (your neck)', and therefore the token frequency of such word-final morphonotactic clusters is very restricted and thus cannot compete with the number of phonotactic clusters in compounds.

Thus, the type-token ratio proves to be a far better distinguisher of quantitatively similar groups than the type or token frequency.

#### 2.2. TRIPLE CLUSTERS ENDING IN *-T*

As expected, triple obstruent clusters are more numerous and varied than quadruple clusters. Not all of them, but nearly all start with a sonorant. In addition to the two final obstruents /st/ we also find /ft/ and combinations of all existing obstruents with final /s/, of course excluding prefinal /s/ due to degemination of /ss/ and prefinal /d, t/ because of the fusion of the dental stop and /s/ to an affricate /t͡s/. Due to such fusion, genitives ending in /t͡s/ also exist, such as *des Punkt-s* 'of the point'. We exclude from our investigation triple clusters consisting of 2 sonorants and 1 obstruent, such as /-lmt, -lnt, -rnt/.

The exclusively morphonotactic triple clusters are 24 in number, i.e. 13 more clusters than the morphonotactic quadruple clusters:

/-xst/: *lach-st* '(you) laugh', superlative *höch-st* 'most highly',

/-xt͡st/: 3rd SG. *ächz-t* 'groans' and its participles,

/-fst/: *schaff-st* '(you) create', adverb *zu-tief-st* 'deepest', *nerv-st* '(you) get on nerves',

/-mst/: *träum-st* '(you) dream', *bums-t* '(s/he/you) bump(s)' and its participle, *spar-sam-st* 'most thriftily',

/-ʃst/: *wisch-st* '(you) wipe',

/-p͡fst/: *klopf-st* '(you) knock',

/-t͡ʃst/: *rutsch-st* '(you) slip',

/-ft͡st/: only in *seufz-t* '(s)he sighs' (and in the reduced 2nd person, see above, similarly in the following examples), and in the participle *geseufz-t*, and its derived verbs,

/-lft/: *hilf-t* 'helps', in weak past participles (e.g. *ge-golf-t* 'golfed'), and in *elf-t*, *zwölf-t* 'eleventh, twelfth',

/-lxt/: 3rd SG. and past participle *er-dolch-t* 'stabbed',

/-lt͡st/: *walz-t* '(s)he waltzes' and its participle,

/-nt͡st/: *tanz-t* '(s)he dances' and its participle, *ver-wanz-t* 'bug-infested', a circumfixation of *Wanze* 'bug',

/-lʃt/: only in *fälsch-t* '(s)he falsifies' and its participle and derived verbs,

/-mʃt/: only in *ramsch-t* '(s)he buys cheap junk' and its participle and derived verbs,

/-rt͡ʃt/: only in *turtsch-t* 'taps (eggs)' and its participle,

/-nʃt/: *wünsch-t* '(s)he wishes' and its participle,

/-pʃt/: *grapsch-t* 'grabs' and its past participle,

/-rʃt/: *forsch-t* '(s)he researches' and its participle,

/-nt͡ʃt/: *plantsch-t* '(s)he splashes' and its participle.

The following examples can never be the 2nd SG. (due to the phonological reduction of *-s*):

/-nxt/ in the only verb *tünch-t* 'whitewashes', its participles and its derivation into a particle verb,

/-lkt/: *melk-t* '(s)he milks', *folg-t* '(s)he follows' and their participles,

/-mp͡ft/: *kämpf-t* '(s)he fights' and its participle,

/-mpt/: *pump-t* '(s)he pumps', *bomb-t* '(s)he bombs' and their participles,

/-rpt/: *zirp-t* '(s)he chirps' and its participle, *stirb-t* '(s)he dies',

/-lpt/: *stülp-t* '(s)he turns up' and *wölb-t* 'curves' and their participles.

There are just 2 clusters which are morphonotactic as a strong default (if we take 75% of types as the criterion):

/-lst/: *will-st* '(you) want', *puls-t* '(s)he pulses' (and 2nd SG.) and its participle, adv. *schnell-st* 'most rapidly', but clearly phonotactic in *Wulst* 'bulge' and its compounds. Doubtful are *Schwul(-)st* 'bombast' and *Ge-schwul(-)st* 'tumour', because most people can relate them to the base verb *schwell-en* 'swell'. But this relation may be classified as rather metalinguistic; there is as yet no evidence that it would be active in processing (e.g. priming) experiments.

/-rt͡st/ as in *schmerz-t* 'it hurts' (also 2nd SG. *schmerz-st*) and its participle, but a unique phonotactic instance in *Arzt* 'physician' and its many compounds.
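The 75% criterion can be sketched as a small classifier over type counts. This is an illustration under stated assumptions: the function name, the category labels for the remaining cases and the counts in the usage lines are hypothetical; only the 75% threshold for a strong default is taken from the text.

```python
def classify_cluster(morph_types, phon_types, strong_default=0.75):
    """Classify a cluster by the share of its morphonotactic word types.

    morph_types -- number of word types in which the cluster spans a morpheme boundary
    phon_types  -- number of word types in which it is morpheme-internal
    """
    total = morph_types + phon_types
    if total == 0:
        raise ValueError("no word types attested")
    share = morph_types / total
    if phon_types == 0:
        return "only morphonotactic"
    if morph_types == 0:
        return "only phonotactic"
    if share >= strong_default:
        return "morphonotactic as a strong default"
    return "ambiguous"


# Invented type counts, for illustration only.
print(classify_cluster(12, 0))
print(classify_cluster(9, 1))
print(classify_cluster(5, 5))
```

Whether compounds are counted as separate types changes the outcome considerably, which is why the following paragraphs report the classification both with and without compounds.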

The following clusters are ambiguous with either a morphonotactic or a phonotactic majority:

/-nst/ as in *dien-st* '(you) serve' and in the homophonous noun *Dien-st* 'service' with an unproductive deverbal nominalization suffix, *grins-t* '(s)he grins' (plus 2nd SG.) and its participle, adv. *fein-st* 'in the finest way'. The cluster is clearly phonotactic in *ernst* 'earnest', *sonst* 'otherwise', *Wanst* 'paunch'. We should also add earlier derivations such as *Kunst* 'art', which many relate metalinguistically, against furious artists' opposition, to the verb *könn-en* 'to be able'; *Gunst* 'favour', which few relate metalinguistically to the etymologically cognate verb *gönn-en* 'not begrudge smth to smbd'; similarly *Brunst* 'sexual heat' to *brenn-en* 'burn'. In terms of types (excluding compounds), the cluster /-nst/ might be called morphonotactic by default, but the 1,993 compounds with the second element *-kunst* render the global type and token frequency of phonotactic clusters the majority.

/-rst/ is morphonotactic in cases such as *war-st* '(you) were', the superlative adverb *schwer-st* 'heaviest', isolated *mors-t* '(s/he/you) send in Morse' and its participle vs. phonotactic *Wurst* 'sausage', *Forst* 'forest', *Durst* 'thirst', *erst* 'first' (which, like its English correspondent, was originally a superlative), but most types occur in compounds. *Ober(-)st* 'colonel' is thoroughly lexicalized (morphosemantically opaque), but clearly related to the superlative *der ober-ste* 'the highest'. When excluding compounds, the types are morphonotactic by default.

/-pst/ is morphonotactic in cases such as *tipp-st* '(you) type', *lieb-st* '(you) love', *pieps-t* '(s)he peeps' (also 2nd SG. and participle *ge-pieps-t*), superlative (or, more precisely, excessive) adverb *herz+aller-lieb-st* 'wholeheartedly dearest', phonotactic in *Papst* 'pope', *Obst* 'fruits', *Probst* 'provost'. Again, this cluster can be considered to be morphonotactic by default, when excluding compounds, but the abundant metaphoric compounds of *Papst* make the global type frequency and token frequency of phonotactic clusters majoritarian.

/-rkt/ occurs as a morphonotactic cluster in *merk-t* '(s)he notices', *sorg-t* '(s)he cares' and their participles, but as a phonotactic cluster in *Markt* 'market', *Infarkt* 'infarct' and their numerous compounds. Without these the cluster is morphonotactic by default.

/-ŋkt/ (also written *-ngt*) is morphonotactic by default as in *bring-t* '(s)he brings', if one excludes the noun *Punkt* 'point, dot' with its numerous compounds, since, again, the richness of German compounding in type and token frequency hides the basic default. Another noun with the phonotactic cluster is *Instinkt*.

/-rxt/ (phonetically [rçt]) is similarly morphonotactic by default, as in *ge-pferch-t* 'crammed', with the only phonotactic cluster in *Furcht* 'fear' and its numerous compounds.

/-rft/ is similarly morphonotactic by default, as in *wirf-t* 'throws' and *nerv-t* 'enervates', with the phonotactic exceptions *Werft* 'wharf' with its many compounds and *Notdurft* 'need' (where the earlier morpheme boundary before nominalizing *t* is obsolete).

/-nft/ is the only cluster of this subgroup which is phonotactic by default, as in *sanft* 'mild' (Austrian variant *Senft* 'mustard' with a secondarily attached final /t/). The only morphonotactic exception is the ordinal number *fünf-t* 'fifth', whereas it is improbable that an analogous morpheme boundary is processed in *Brunft* 'rut (of deer)', historically derived from *brenn-en* 'to burn', because of its morphotactic and morphosemantic opacity, and with most nouns analogously derived from particle verbs with the verbal base *komm-en* 'come', such as *Zukunft, Hinkunft* 'future' vs. *zukommen* 'approach, belong'.

/-kst/ (also written -*chst*, -*ckst*, *-gst, -xt*) is morphonotactic by default, as in *wächs-t* 'grows' (also in the 2nd SG., as in *weck-st* '(you) wake'); the only phonotactic exceptions are *Text* 'text' and *Axt* 'axe' with their numerous compounds.

There are no other word-final triple consonant clusters with 2 final obstruents, except in foreign names, such as *Minsk, Kursk*. Other comparable triple clusters with final *-t* do not occur, because conceivable and pronounceable clusters such as *-skt, -spt* do not occur as phonotactic clusters and, in contrast to English, they are excluded as morphonotactic clusters, because no verb roots (nor nouns) ending in *-sk, -sp* exist in German. Adjectives ending in *-sk* do not form a superlative in *-sk+st*, but insert an *-e-* before the superlative suffix. Other fricatives have a still smaller phonotactic distribution than /s/.

Thus, all word-final triple clusters which contain two obstruents are morphonotactic (only exception: those in -*nft*), because phonotactic clusters either do not occur or only occur as exceptions when counted in lemmas. But their type and token number may be competitive with morphonotactic ones due to compounding. Many of the lemmas with final phonotactic clusters go back to derivations with a morphonotactic cluster.

As expected, morphonotactic clusters ending in the longer suffix -*st* have fewer phonotactic counterparts than morphonotactic clusters ending in the shorter suffix -*t*.

Turning to a NAD analysis of triple final clusters ending in /t/, we start with the presentation of the frequency demonstrated in Table 2:


Table 2. Frequency ranks of word-final triples


In contrast to quadruple clusters, triple clusters do not form several neatly separated groups according to the TTR: the TTR of just 4 clusters is clearly above 1%, one amounts to 20.7% and only one has a TTR of 100%. None of the triple clusters has just 1 type.

The NAD phonotactic calculator establishes the preferences of the clusters (structure VCCC) as presented in Table 3:


Table 3. Preference rankings of word-final triples according to NAD<sup>3</sup>

 <sup>3</sup> Three clusters /-nt͡ʃt/, /-t͡ʃst/ and /-rt͡ʃt/ were excluded from the analysis because the NAD calculator does not recognize the affricate /t͡ʃ/. Therefore, they were counted manually.


From Table 3 the following conclusions can be drawn:

The majority of preferred clusters start with a rhotic, lateral or nasal sonorant followed by two obstruents or another sonorant. The distance between the neighbouring phonemes is always greatest when the cluster starts with a rhotic or lateral sonorant, for instance the NAD product of /rpt/ is 5.1 and the NAD product of /rt͡st/ is 3.85.

Out of 33 word-final consonant clusters, 19 clusters are preferred and 14 dispreferred. If we add the 3 clusters that the NAD calculator could not handle, then we obtain 19 preferred clusters and 17 dispreferred clusters.

However, there is the question of whether similar predictions can be deduced in a simpler process of calculation. Since the NAD calculator is the most elaborate tool for deducing the predictions on the degrees of markedness for (mor)phonotactic clusters so far, it is worth trying to modify the method of NAD calculation.

Thus, we applied a factor analysis in order to test whether there is a correlation among the variables which were previously obtained in the present research. For the factor analysis, 30 word-final consonant clusters and 7 independent variables were selected. The first and second variables are the numbers of word types and tokens from the AMC for each cluster, followed by the auditory distances between the neighbouring phonemes according to the NAD calculator. The next two variables represent the information whether the cluster is preferred or dispreferred and the division between phonotactic vs. morphonotactic (Phon/morph) clusters, as presented in Table 4.


Table 4. Factor analysis for word-final triple consonant clusters

Numbers in bold indicate a significant correlation among the variables. For instance, in Factor (1) we may observe a certain correlation between NAD (VC) and NAD (C1C2). The possible explanation is that if we look at the NAD table of all 30 clusters, we can see that the measures of NAD (VC) and NAD (C1C2) are inversely related to each other in most of the cases: if the NAD (VC) is high, then the NAD (C1C2) will be smaller. For example, in the word-final cluster Vfst the NAD (VC) is equal to 5 and the NAD (C1C2) is 0.5. Conversely, in the cluster Vrpt the NAD (C1C2) is equal to 6.6 and the NAD (VC) is 2.

The next observation is that cluster preferredness is related to the NAD (VC) and the NAD (C1C2). In general, if the NAD (C1C2) is higher than the NAD (VC), then the cluster is more likely to be preferred. This corresponds entirely to the NAD formula for triple finals shown above.

From Factor (2) we can see that there is a certain correlation between word types and tokens. They are connected in the same direction, so we could assume that if the number of word types grows, then the frequency grows as well.

For Factor (3) we can observe that the NAD (C2C3) is not connected to any of the variables, but it is still significant, presumably in relation to other variables not yet discussed.

Most notably, the factor analysis has shown that the NAD (C2C3) is not related to the NAD (VC) or the NAD (C1C2), which goes against a well-established NAD formula for predicting the preferredness for word-final triple clusters. Therefore, one assumption that can be inferred is that the NAD distances of two phonemes in the cluster, namely the NAD (VC) and the NAD (C1C2), might be enough to decide on the preferredness of word-final clusters in German. However, more research on consonant clusters in different word positions as well as of different languages is needed in order to corroborate this statement. For that reason, we have compared the cluster preferredness of German, English and Polish in the word-initial and word-final positions via the NAD calculator when the most peripheral consonants were excluded from the analysis. The results are discussed in section 4.2.
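The reduced criterion suggested by the factor analysis can be sketched as a two-distance check. The function name is hypothetical and the rule is only the simplification hypothesized above, not a feature of the NAD calculator. The two pairs of distances are the ones quoted earlier for Vfst and Vrpt; the outcomes are what this heuristic predicts, not values read off Table 3.

```python
def simplified_final_preferred(nad_vc, nad_c1c2):
    """Reduced word-final check: only NAD(V,C1) <= NAD(C1,C2) is tested.

    The third distance, NAD(C2,C3), is deliberately ignored, following the
    assumption that it contributes little to the preferredness decision.
    """
    return nad_vc <= nad_c1c2


# Distances quoted in the text for two word-final clusters.
examples = {
    "Vfst": (5.0, 0.5),  # NAD(VC) = 5, NAD(C1C2) = 0.5
    "Vrpt": (2.0, 6.6),  # NAD(VC) = 2, NAD(C1C2) = 6.6
}
for cluster, (nad_vc, nad_c1c2) in examples.items():
    verdict = "preferred" if simplified_final_preferred(nad_vc, nad_c1c2) else "dispreferred"
    print(cluster, verdict)
```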

If we compare the preference predictions in Table 3 or just compare its third and fourth columns, where the NAD (C1C2) should be bigger than the NAD (VC), and if we split Table 2 into two based on the frequency ranking, putting 18 clusters into the first half and 18 into the second, then we find 11 preferred and 7 dispreferred clusters within the first group, and 10 preferred and 8 dispreferred clusters in the second half. This is a positive, i.e. supportive, but not a significant difference. With regard to the claim that phonotactic clusters are more preferred than morphonotactic clusters, we found that among the exclusively morphonotactic clusters, 14 are preferred and 11 dispreferred, whereas among those clusters which are both morphonotactic and phonotactic, 7 are preferred and 4 dispreferred. This is again a positive but not a significant difference.

Moreover, all (but one) of the word-initial triple clusters, which are all exclusively phonotactic, are preferred clusters. And this seems to represent a very significant difference from the mainly morphonotactic word-final clusters. However, the triple final clusters ending in *-s* (discussed in the following section 2.3) are all exclusively morphonotactic and all preferred clusters.
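The two comparisons can be checked with a test for 2×2 tables. The text does not state which test was used; Fisher's exact test is chosen here purely as an assumption, because the counts are small. The implementation below is a plain standard-library sketch with a hypothetical function name; the four-cell counts are those given in the paragraph above.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more probable than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_obs = prob(a)
    total = sum(prob(x) for x in range(lowest, highest + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)


# Frequency halves: 11 preferred / 7 dispreferred vs. 10 preferred / 8 dispreferred.
p_frequency = fisher_exact_two_sided(11, 7, 10, 8)
# Exclusively morphonotactic (14 / 11) vs. mixed clusters (7 / 4).
p_status = fisher_exact_two_sided(14, 11, 7, 4)
print(p_frequency > 0.05, p_status > 0.05)
```

Under this assumed test, both p-values lie above the conventional 0.05 level, which agrees with the description of the differences as not significant.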

#### 2.3. TRIPLE CLUSTERS ENDING IN *-S*

A further source of word-final morphonotactic obstruent groups is the nominal *-s* Gen.SG., less commonly the homophonous plural suffix, as in *Kalb-s* 'calf' (also plural), *Korb-s* 'basket', *Ge-zirp-s* 'chirping', *Schilf-s* 'reed', *Dorf-s* 'village', *Nerv-s* 'nerve', *Talg-s* 'tallow'. Parallel phonotactic clusters occur in *Rülps* 'belch' and *Mumps*. Similar morphonotactic clusters arise through the suffixation of plural *-s*, as in Gen.SG. and PL *Tank-s*, *Skalp-s* 'scalp', *Ulk-s* 'trick', and adverbial *-s*, as in *aller-ding-s* 'indeed'.

Word-final, exclusively morphonotactic, triple clusters with /s/ at the end are the following (all Gen.SG.; if also plurals, then explicitly noted):

/-rps/: *Bewerb-s* 'competition', *Korb-s* 'basket' and their numerous compounds,

/-rfs/: *Dorf-s* 'village', *Wurf-s* 'throwing' and *Nerv-s* 'nerve' and their numerous compounds,

/-rks/ as in Gen.SG. *Bezirk-s* 'district', Gen.SG. and PL of recent English loan-words, such as *Park-s*. A phonotactic exception is *Murks* 'botch',

/-rxs/: *Monarch-s* with a few compounds,

/-rʃs/: *Hirsch-s* 'stag',

/-lfs/: *Wolf-s* 'wolf',

/-lks/: *Erfolg-s* 'success', *Volk-s* 'people, folk',

/-lxs/: *Elch-s* 'elk' with several compounds,

/-nks/: also PL in the English loan word *Song-s*, only adverb *link-s* 'to the left',

/-nʃs/: *Wunsch-s* 'wish' with a few compounds,

/-nxs/: only *Mönch-s* 'monk' with its many compounds,

/-nt͡ʃs/: only in English loan words, e.g. *Brunch-s* (more than 60% plurals, less than 40% Gen.SG. on average),

/-mps/: only in English loan words (also PL), e.g. *Vamp-s*; a phonotactic exception is the loan word *Mumps*,

/-lps/ occurs only in *Kalb-s* 'calf' and in the loan word (also PL) *Skalp-s* 'scalp' and their compounds; a phonotactic exception is the onomatopoeic *Rülps* 'belch',

/-mp͡fs/: *Kampf-s* 'fight' and its compounds,

/-mʃs/ only in *Ramsch-s* 'junk',

/-sks/ only in loan words (also PL), e.g. *Disk-s*.

The frequency ranking of these clusters is presented in Table 5:


Table 5. Frequency ranks of triple clusters ending in *-s*

The spread of the TTR is similar to the triple clusters ending in /t/, but there is one cluster with only one type.

The preferences established by the NAD calculator for VCCC clusters are the following (see Table 6):

Table 6. Preference rankings of word-final triples ending in *-s* according to NAD



Thus, all triple clusters ending in *-s* are preferred clusters, although all of them are exclusively morphonotactic, two of them with a marginal phonotactic exception.

Also, there are several double final morphonotactic consonant clusters with an affricate /t͡s/, due to Gen.SG. and rarely PL *-s*: /xt͡s/ as in *Berichts* 'report', /kt͡s/ as in *Projekts* 'project', /pt͡s/ as in *Konzepts* 'concept', /lt͡s/ as in *Anwalts* 'lawyer', /nt͡s/ as in *Abends* 'in the evening', and /rt͡s/ as in *Jahrhunderts* 'century'. The only phonotactic correspondents are words such as *Holz* 'wood', *Tanz* 'dance', *Scherz* 'joke', i.e. those in which a sonorant precedes the affricate.

A problem is represented by imperatives of the type *knicks*! 'curtsey!', *schubs*! 'push!'. First, it is unclear whether the word-final *-s* is synchronically still a derivational suffix. Second, even if not, it is unclear whether such imperatives are to be classified as base forms (if yes, then phonotactic) or as morphologically derived from the infinitive as a lexical entry.

#### 2.4. TRIPLE CLUSTERS ENDING IN *-TS*

The masculine and neuter Gen.SG. *-s* (potentially also the homophonous plural suffix, but actually only in a single cluster) is the source of nearly always morphonotactic clusters ending in the affricate *-ts* due to fusion of the inflectional suffix with a stem-final dental stop (for frequency ranks see Table 7):

/-rst͡s/: *Durst-s* 'thirst',

/-lst͡s/: *Schwulst-s* 'bombast',

/-pst͡s/: *Papst-s* 'pope', *Herbst-s* 'autumn' and their many compounds,

/-nst͡s/: *Dienst-s* 'service' and its many compounds,

/-rkt͡s/: *Markt-s* 'market' and its many compounds,

/-nkt͡s/: *Punkt-s* 'point' and its many compounds,

/-nft͡s/: *Senft-s* 'mustard',

/-rpt͡s/ only in *Exzerpt-s* 'excerpt',

/-tst͡s/ only in *Arzt-s* 'physician' with its many compounds,

/-kst͡s/ only in *Text-s* and its compounds.


Table 7. Frequency ranks of triple clusters ending in *-ts*

Here we have no groupings of clusters according to TTR, but there are three clusters with just one type. Again, all clusters are preferred according to the NAD calculator, although all of them are exclusively morphonotactic.

#### 2.5. WORD-INITIAL POSITION

The German standard has no monoconsonantal prefixes, in contrast to Bavarian-Austrian dialects, as in *g'storben* 'died', *b'soffen* 'drunk', *z'ruck* 'back(wards)' etc., corresponding to Standard German *ge-storb-en*, *besoff-en*, *zu(-)rück*. Thus, the German standard is rather poor in word-initial clusters, and all word-initial clusters are exclusively phonotactic. Some of the more dispreferred ones occur only in loan words from Ancient Greek and their derivations, e.g. /mn-/. German phonotactic initial double clusters were partially studied in Dziubalska-Kołaczyk (2002) with regard to universal phonotactic preferences. Moreover, double obstruent clusters serve as a basis for the complexity of triple initial clusters.

Phonotactic preferences for word-initial clusters in German have been studied by Orzechowska and Wiese (2011, 2015). They proposed an alternative approach to the NAD which is not limited to the size of the cluster and is not based on a sonority hierarchy but on an empirical analysis of features. The analysis of German initial clusters was based on 15 parameters, which included different values such as the cluster complexity, place of articulation, manner of articulation and voicing, in order to build a quantitative ranking of all clusters in terms of adherence to the preferences established by the Sonority Sequencing Generalization. This last approach will not be followed here.

For our study, the most interesting word-initial double clusters consist of two obstruents, particularly with a fricative in first position and a stop in second position: /ʃt-/ as in *statt* 'instead of' and /ʃp-/ as in *spielen* 'to play'. Words of foreign origin can also start with /sk-/ as in *skeptisch* 'sceptical', /sp-/ as in *Spatium* 'space', /st͡s-/ as in *szenisch* 'scenic', isolated /xt-/ as in *chthonisch* 'chthonic', and /ft-/ as in *Phthisis* 'wastage'.

A fricative is followed by another fricative, or rather approximant, in /ʃv-/ as in *schwer* 'heavy', or in loan words in /sv-/ as in *Sweater*, /sf-/ as in *sphärisch* 'spherical', or /sx-/ as in *Schizophrenie* 'schizophrenia'; an affricate precedes /v/ in /t͡sv-/ as in *zwei* 'two'.

An obstruent is followed by a sonorant, first with a fricative as the obstruent: /ʃr-/ as in *schreiben* 'to write', /ʃm-/ as in *schmecken* 'to taste', /ʃn-/ as in *schneiden* 'to cut', /ʃl-/ as in *schließen* 'to close', /fl-/ as in *flach* 'flat', /fr-/ as in *fragen* 'to ask', /vr-/ as in *Wrack* 'wreck', only in loan words /sm-/ as in *Smaragd* 'emerald', /xr-/ only in the isolated learned loan word *Chrie* 'school theme' (/vl-/ only in foreign names such as *Vladimir*, *Wladiwostok*).

A stop is followed by a sonorant in /gr-/ as in *groß* 'large', /gl-/ as in *glücklich* 'happy', /gn-/ as in *gnadenlos* 'merciless', /kl-/ as in *Kleid* 'dress', /kr-/ as in *krank* 'sick', /kn-/ as in *Knie* 'knee', /bl-/ as in *bleiben* 'to stay', /br-/ as in *brechen* 'to break', /pl-/ as in *plump* 'clumsy', /pr-/ as in *Pracht* 'splendour', /dr-/ as in *drei* 'three', /tr-/ as in *tragen* 'to wear'. An affricate is the first obstruent in /p͡fl-/ as in *pflegen* 'to care for' and /p͡fr-/ as in *pfropfen* 'to graft'.

A stop is followed by a fricative in words of foreign origin in /ks-/ as in *Xenophobie* 'xenophobia' or /ps-/ as in *psychisch* 'psychological'. A stop is followed by the fricative or approximant /v/ in /kv-/ as in *Quelle* 'source', and an affricate is followed by /v/ in /t͡sv-/ as in *Zwang* 'coercion'.

A sequence of word-initial stops is limited to words of Ancient Greek origin: /pt-/ as in *Pteridin* 'pteridine', /kt-/ as in *ktenoid* 'ctenoid'.

The majority of double clusters that do not occur only in learned words of foreign origin respect the preferences of the Beats-and-Binding Model (Dziubalska-Kołaczyk 2002: 112).

In this contribution, we stick to the longer clusters with the maximum number of consonants in the onset, which is three. There are eight types of triple initial consonant clusters in German (see Table 8). All of them consist of two obstruents plus a sonorant or approximant: /ʃtr-/ as in *streng* 'strict', /ʃpr-/ as in *spricht* 's/he speaks', /ʃpl-/ as in *Splitter* 'splinter'; next, in words of foreign origin, /skr-/ as in *skrupellos* 'ruthless', /skl-/ as in *sklavisch*, adjective of 'slave'. In more recent loan words we also find /skv-/ as in *Squaw* (the only integrated loan word with this cluster, with the possible exception of *squash*), /spr-/ as in *Sprinter* and /spl-/ as in *Spleen*.


Table 8. Frequency ranks of triple word-initial clusters

These triple clusters also exhibit no grouping according to TTR; only one cluster has just one type.


Table 9. Preference rankings of word-initial triples according to NAD

Table 9 presents the NAD analysis of these clusters and the quantification of rising preferences. For word-initial consonant clusters we undertook a factor analysis analogous to that for the word-final consonant clusters in section 2.2. When eliminating the first consonant, the two remaining NAD distances, NAD (C2C3) and NAD (CV), again showed the same preferences as when including the first consonant, i.e. we arrived at the same result as in section 2.2.

In conclusion we can see that:

1) All word-initial triple clusters consist of initial double obstruent clusters of a s(h)ibilant plus a stop followed by a rhotic or lateral sonorant or the fricative/approximant /v/. Other double clusters which occur in the word-initial position, i.e. /bl, br, gr, gl, gn, gm, dr, xr, xt, kn, p͡fl, p͡fr, ʃl, ʃv, ʃr, ʃm, ʃn, ps, sf, sm, st͡s, t͡sw/, cannot be part of a word-initial triple cluster, except for extragrammatic words such as the interjection *pst*, which has the further irregularity of containing a syllabic fricative.

2) There is a moderate correlation between the degree of preferredness and the frequency in the AMC: the most preferred cluster is /ʃpr/, which has the highest token frequency and the second-highest type frequency; the next cluster in the hierarchy of preferences is /ʃtr/, which has the highest type frequency and the second-highest token frequency. The other three clusters differ little in preferredness and their frequency ranks decrease in parallel for types and tokens. The reason for the mismatch between the type and token frequency differences of /ʃtr/ and /ʃpr/ is on the one hand historical, insofar as they go back to the earlier clusters /str/ and /spr/, the only word-initial triple consonant clusters reconstructed with some certainty for Proto-Indo-European (Oppermann 2004). On the other hand, the general phonotactic preference for /ʃpr/ may have had a positive impact on its token frequency. The only dispreferred cluster /skv/ is rare and occurs only in one word type (or two).
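The template stated in conclusion 1 can be made concrete in a few lines of code. The following is only a minimal sketch: the three segment classes are a simplification assumed for illustration (they are not the feature system or the NAD calculator used in this chapter), and the function names are hypothetical.

```python
# Illustrative check of conclusion 1: every word-initial triple cluster is
# s(h)ibilant + stop + (rhotic, lateral or /v/).
# The segment classes are assumptions made for this sketch only.
SIBILANTS = {"s", "ʃ"}
STOPS = {"p", "t", "k"}
THIRD_SEGMENTS = {"r", "l", "v"}

# The eight word-initial triple clusters listed in the text,
# written with one symbol per segment.
TRIPLES = ["ʃtr", "ʃpr", "ʃpl", "skr", "skl", "skv", "spr", "spl"]


def fits_template(cluster: str) -> bool:
    """Return True if a three-segment cluster matches the template."""
    if len(cluster) != 3:
        return False
    first, second, third = cluster
    return first in SIBILANTS and second in STOPS and third in THIRD_SEGMENTS


def can_open_triple(double: str) -> bool:
    """Return True if a double cluster is sibilant + stop, i.e. could be
    the first two segments of a word-initial triple cluster."""
    return len(double) == 2 and double[0] in SIBILANTS and double[1] in STOPS


if __name__ == "__main__":
    for cluster in TRIPLES:
        print(cluster, fits_template(cluster))
    for double in ["ʃt", "sk", "bl", "kn", "ʃm", "ps"]:
        print(double, can_open_triple(double))
```

Under these assumptions, all eight attested triples fit the template, whereas double clusters such as /bl/, /kn/, /ʃm/ or /ps/ are rejected as possible beginnings of a triple cluster.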

#### 3. WORD-INTERNAL POSITION

Word-internal clusters are presented only briefly and selectively for the following reasons: first of all, word-medial consonant clusters are much more varied and complex than initial and final ones, so that an equally extensive study would exceed space limits. Second, the corpus linguistic tools of the AMC do not permit the same procedures of analysis as for initial and final clusters. Third, the NAD calculator cannot predict preferences for the many complex clusters of more than three consonants. Fourth, internal clusters are psycholinguistically less important than peripheral clusters due to the bathtub effect, which renders the periphery of a unit better perceivable than its interior (Aitchison 2003: 138). Therefore, we limit our discussion to observations of general differences between morphonotactic and phonotactic consonant clusters and their explanations.

It holds for phonotactic clusters that word-internal syllable onsets always follow the pattern of word-initial onsets. In compounding and derivation, the syllable boundary always follows the morpheme boundary in consonant clusters.

In a word-internal position, there is a much greater variety of consonant clusters than in the peripheral positions. Phonotactic clusters that occur only word-internally have an internal syllable boundary, but they are rather few, such as /fk, dl, dv/ as in the plant name *Levkoje*, in *Adler* 'eagle', where a vowel has been lost, and *Advent* 'advent', where a morpheme boundary has been lost, and /tl/ as in the loan word *Atlas*. There are a few triconsonantal phonotactic clusters, such as /ktr, ltr, mpl, rt͡sn, stm/ as in the loan words *Spektrum*, *Altruismus* 'altruism', *Amplitude* 'amplitude', *Arznei* 'medicine', *Asthma*, thus hardly any with two obstruents.

The bulk of new word-internal consonant clusters are morphonotactic due to the addition of morpheme-initial to morpheme-final clusters in compounding and affixation. This often creates morphonotactic clusters which are disallowed word-initially or word-finally and may contain more consonants than are permitted in the word periphery. Examples are the compound *Herbstpflanze* 'autumn plant' and the suffixation *herbstlich* 'autumnal', as well as the prefixation *ent-springen* 'originate'. In compounding, interfixation may either break up (by the interfix *-e-*) or increase (by the much more frequent interfix *-s-*) the sequence of consonants as in *Weg+e+lagerer* 'highwayman' and *König+s+schloss* 'royal castle'. The syllable boundary is always after the interfix, which fits with the fact that the main morpheme boundary is always after, and never before, the interfix.

Verb prefixation and particle verb formation create new word-internal consonant clusters as well. For example, the separable particle *ab-* motivates the exclusively morphonotactic clusters /p-d, p-t, p-g, p-k, p-ʃ, p-t͡s, p-v/, as in *ab-drehen* 'turn off', *ab-geben* 'give in', *ab-kommen* 'get away', *ab-treten* 'wear out', *ab-schaffen* 'abolish', *ab-wickeln* 'unwind', *ab-ziehen* 'remove' (with the addition of longer clusters, as in *ab-streiten* 'deny'). Moreover, some of the few non-separable verbal prefixes create new clusters, as with *ent-*, and the earlier but now only vestigial affix *ant-* as in *Ant-wort* 'answer'; in the parallel formation *Antlitz* 'face' the morpheme boundary was lost, and the cluster became a phonotactic one. A morpheme boundary must also be assumed after cranberry morphs, as in *Sintflut* 'deluge', cf. *Flut* 'flood'.

In contrast to many non-Germanic Indo-European languages, German affixation does not provoke internal vowel deletion and internal morphonotactic clusters caused by it, other than of the weakest vowel schwa. An exception is *Risiko* 'risk' → adj. *risk-ant*. An epenthetic schwa is lost before a (originally word-final) sonorant in derivation, such as in the derived adjectives *adl-ig* 'noble', *silbr-ig* 'silvery' (more examples in Meinhold & Stock 1980: 197–201). Inflectional affixation results even more rarely in subtraction, which creates morphonotactic clusters, such as in *Risk-en*, the plural of *Risiko* (in contrast to the much greater frequency in Slavic languages, Latin, Greek and other ancient Indo-European languages).

In addition, word formation creates geminate consonants which are disallowed morpheme-internally and phonotactically; with even more marked results, pseudogeminates are created by syllable- and morpheme-final obstruent devoicing, as in *ab-bauen* 'dismantle' with /p, b/.

Among clusters which are both phonotactic and morphonotactic, the productive word formation devices of compounding, verbal prefixation and particle verb formation may greatly outweigh the proportion of phonotactic clusters in types and tokens, e.g. for clusters starting with /-st-/, as in *west+römisch* 'Western Roman' and *aus-treiben* 'drive out' as opposed to phonotactic cases in loan words, such as *Pastrami*. This may create problems for matching phonotactic and morphonotactic clusters in psycholinguistic tests.

Only the complexity of consonant clusters, at least in terms of the number of member consonants and of the creation of new clusters which are not allowed in phonotactics, rises due to morphological operations. And in this sense, morphonotactic clusters are, on average, more marked than phonotactic clusters.

#### 4. CONCLUSIONS

#### 4.1. GENERAL RESULTS

The claim that in general morphonotactic clusters are more dispreferred than phonotactic clusters (Dressler & Dziubalska-Kołaczyk 2006: 83, Zydorowicz et al. 2016: 19–20) has been disproven for German peripheral triple consonant clusters. This removes an apparent contradiction between the claim and external psycholinguistic evidence from acquisition and processing experiments. In the first language acquisition of at least the richly inflecting languages Polish and Lithuanian, morphonotactic clusters are acquired earlier than phonotactic clusters (Zydorowicz 2010, Kamandulytė-Merfeldienė 2015). And at least in certain psycholinguistic experiments (cf. the other contributions to this volume), morphonotactic clusters are processed more quickly than phonotactic ones. Therefore, the claim that morphonotactic clusters are more dispreferred than phonotactic clusters should be dropped.

This conclusion is also supported by the ease of diachronic introduction of new, i.e. morphonotactic clusters into languages that lacked them. A further finding on diachrony is that we have found in German, in analogy to what has been found in other languages, examples of the lexical development of morphonotactic clusters into phonotactic ones because of morphosemantic opacity leading to the loss of morpheme boundaries, as in *Brunst* 'ardour, lust' no longer being related to its former verb base *brenn-en* 'burn', except metalinguistically (cf. Dressler et al. 2019).

Similarly to many other languages, quadruple clusters can be reduced in casual speech. Thus, the normal pronunciation of 2nd SG. *wäsch-st* '(you) wash' is [vɛʃt]. These instances are fairly regular if the NAD distance is minimal, as in this case.

Probably, segmentally identical phonotactic and morphonotactic clusters have different vowel durations (cf. Plag 2014; Zimmerer, Scharinger & Reetz 2014), but it is, as yet, unclear whether these differences lie above the threshold of perceptibility. Moreover, other studies contradict these findings (see the discussion in Leykum & Moosmüller, this volume). In any event, Plag is right in objecting to linguistic models which crucially contain a flow-chart from one submodule to another in a way which presupposes bracket erasure (also criticized in Brown & Hippisley 2012: 273). Our model of morphonotactics (Dressler & Dziubalska-Kołaczyk 2006; Dressler et al. 2010; Korecky-Kröll et al. 2014) does not presuppose such bracket erasure. This also fits Slovak word-medial patterns: assuming that in a flow-chart, inflectional morphology follows derivational morphology, the derivational boundary in *potok* 'stream' must not be erased in order to prevent vowel deletion in Gen.SG. *po-tok-a/u*, in contrast to the deletion of the second vowel in the oblique cases of *ist-ok* 'source' and *otec* 'father' (Dressler et al. 2015).

For results regarding NAD calculations, see section 2.

#### 4.2. TYPOLOGICAL CONCLUSIONS

Phonotactic asymmetries between word-initial, word-final and word-medial positions are well known. This starts with how the universal preference for CV structures (Dziubalska-Kołaczyk 2002, 2009) is realized in the three positions and depending on whether a word is monosyllabic, disyllabic or polysyllabic.

What is interesting for the typological characterization of German is the much greater variety and complexity of word-final than of word-initial clusters, e.g. in contrast to Slavic languages, Latin, Greek and other Indo-European languages. This asymmetry is also reflected in greater type and token frequencies for word-final than for word-initial obstruent clusters. Type frequency asymmetries proved to be radicalized in token frequency differences, which means that the dominant patterns are more profitable.

This asymmetry has two sources: on the one hand, we have the diachronic result of prehistoric or early historic major vowel deletions in German word-final positions as opposed to the optimal preservation of vowels in word-initial positions. Those lost vowels of word-final syllables were all unstressed, which was not the case for word-initial syllables. On the other hand, we have the more important consequence of German having many short derivational and inflectional suffixes which are monoconsonantal or biconsonantal. But due to the restriction of morphological consonantism to very few consonants, already identified by Jakobson (1962: 108) for Indo-European languages, in German we find only final morphonotactic clusters ending in *-t, -s, -st, -t͡s*. Therefore, it seems a paradox that we find a still more radical restriction for final phonotactic clusters, namely to *-t, -st* and to nouns. The reason is again diachronic: all the final phonotactic nominal triple clusters go back or seem to go back to morphonotactic clusters with a final suffix now ending in *-t* due to the loss of unstressed vowels that followed them or a *-t* added secondarily in early New High German as a phonological addition, as in *Werft* 'shipyard', *Axt* 'axe', *Obst* 'fruit', *sonst* 'otherwise', dialectal *Senft* 'mustard' (Kluge & Götze 1957 sub vocibus).

Word-internally, the contrast between exclusively morphonotactic and exclusively phonotactic triconsonantal clusters seems to be even bigger. Also, here most triconsonantal clusters with two obstruents are only morphonotactic. And among ambiguous consonant clusters, the frequencies of morphonotactic clusters seem to be higher than those of phonotactic clusters. For efficient calculation of these frequency relations, new text-technological tools must be developed.

The fact that in German peripheral positions the NAD preferences for consonant clusters are identical irrespective of whether the most peripheral consonant is included or excluded in the NAD calculations seems to be specific to Germanic languages. When we checked peripheral consonant clusters in Polish and English according to the list of clusters in Zydorowicz et al. (2016), we found that the (dis)preferredness of consonant clusters is different in Polish depending on whether the most peripheral consonants are included or excluded, but not in English.

Polish and at least Slovak among other Slavic languages (Dressler et al. 2015) differ from German and English with regard to peripheral triple consonant clusters in the following features, which appear to be relevant for the impact of the most peripheral consonant on cluster preferences when they are added to the more interior double consonant clusters:

First of all, the two Slavic languages are consonantal languages to a higher extent than the two Germanic languages. They have a much higher number of different triple consonant clusters than the two Germanic languages. For example, Polish has more than a hundred word-initial triple clusters, German only eight.

Second, Polish has many more word-initial triple morphonotactic clusters in tokens than phonotactic clusters; the two Germanic languages have no word-initial morphonotactic clusters.

Third, for word-final triple consonant clusters, the two Germanic languages have many more morphonotactic than phonotactic clusters, all of them due to the morphological operation of suffixation (i.e. addition). Polish and Slovak have only word-final morphonotactic clusters created through the subtractive morphological operation of deletion of the word-final stem vowel in the genitive plural, e.g. in Pol. *zemst* vs. Nom.SG. *zemsta* 'revenge', Slov. *pomst* vs. Nom.SG. *pomsta* 'revenge'. In addition, Polish and other Slavic languages also create word-initial and word-medial consonant clusters due to vowel deletion in inflection and derivation, as in Pol. Gen.SG. *ps-a* from *pies* 'dog'. German has only rare word-medial cases (see section 3).

Fourth, the most peripheral German consonants in triple consonant clusters in a word-initial position are only /s/ and /ʃ/ (in English only /s/), whereas Polish and Slovak also have many other consonants in this position. In word-final position the most peripheral consonants in German are only /t, s, t͡s/, in English /t, d, s, z/. These consonants are also the preferred final consonants in double clusters. By contrast, many different final consonants occur in Polish and Slovak word-final clusters. Thus, it seems that in the case of strong restrictions on the selection of the most peripheral consonants, the selection is natural, in the sense of not changing the (dis)preferredness of the interior consonant clusters to which they are added. This is reminiscent of those phonotactic analyses which assume for German, as for many other languages, that any third consonant in a tautosyllabic consonant cluster is extrasyllabic or extrametrical (see Wiese 1988, 2000).

This may also explain why, in the diachronic development of German, /t/ was sometimes added to a word-final consonant, as in *Axt* 'axe', *Palast* 'palace', *Obst* 'fruit' from MHG *obes*, *Sekt* 'sparkling wine' from Fr. *vin sec*, dialectal Austrian German *Senft* ← *Senf* 'mustard'.

#### 4.3. CONSIDERATIONS ON WORKING WITH LARGE ELECTRONIC CORPORA

Working with large electronic corpora allows us to arrive at more reliable quantitative results. Here, the type-token ratio is very low for all triple clusters. For quadruple clusters we found (see section 2.1) distinct groupings within the whole range from 0.01% to 100%. Thus, the numerically most complex clusters behave differently than the less complex and more numerous triple clusters. The largest subgroup of quadruple clusters has a similar TTR distribution to the triple ones and contains the only four clusters which also include a small phonotactic minority. The more numerous groups of quadruple clusters are only morphonotactic: this again indicates the marked character of complex consonant clusters.
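The type-token ratio (TTR) referred to here can be computed per cluster from a list of corpus tokens. The sketch below is illustrative only: the function name is hypothetical, the input format (one cluster label and one word form per token) is an assumption, and the toy tokens are invented and do not reproduce AMC counts.

```python
from collections import Counter


def cluster_ttr(tokens):
    """Compute the type-token ratio for each cluster.

    tokens: iterable of (cluster, word_form) pairs, one pair per corpus token.
    Returns a dict mapping each cluster to types / tokens (a value in (0, 1];
    multiply by 100 for a percentage).
    """
    token_counts = Counter()
    types_per_cluster = {}
    for cluster, word_form in tokens:
        token_counts[cluster] += 1
        types_per_cluster.setdefault(cluster, set()).add(word_form)
    return {
        cluster: len(types_per_cluster[cluster]) / token_counts[cluster]
        for cluster in token_counts
    }


if __name__ == "__main__":
    # Invented toy tokens for illustration, not corpus data.
    toy_tokens = [
        ("rks", "Bezirks"), ("rks", "Bezirks"),
        ("rks", "Parks"), ("rks", "Parks"),
        ("lps", "Kalbs"),
    ]
    print(cluster_ttr(toy_tokens))  # {'rks': 0.5, 'lps': 1.0}
```

In a large corpus a frequent cluster accumulates tokens much faster than new types, which is one way to read the very low TTR values reported for all triple clusters.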

Our corpus-based study relied on the huge electronic corpus AMC, which may be the most complete print media corpus for any nation. This enhanced reliability for quantitative generalizations about the distribution of morphological and lexical patterns of consonant clusters. The disadvantage that such big corpora include many erroneous types of words was at least partially corrected for by manual exclusion of errors and by the restriction to types which have at least 5 tokens in the corpus. We included clusters with fewer than 5 tokens only if the cluster would otherwise not have been represented in our description. In discussions with other native speakers of German we could not think of any potential morphonotactic cluster which does not occur in the AMC.
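The frequency restriction described above can be sketched as a simple filter. This is a hedged illustration, not the procedure actually run on the AMC: the function name and data layout are hypothetical, the counts are invented, the manual exclusion of errors is not modelled, and the fallback (keeping all rare types of a cluster that would otherwise disappear) is one possible reading of the exception stated in the text.

```python
def select_types(type_counts, min_tokens=5):
    """Apply a minimum-token threshold to the word types of each cluster.

    type_counts: dict mapping cluster -> dict mapping word type -> token count.
    Types with at least min_tokens tokens are kept. If no type of a cluster
    reaches the threshold, all of its types are kept so that the cluster
    remains represented (an assumption of this sketch).
    """
    selected = {}
    for cluster, counts in type_counts.items():
        kept = {word: n for word, n in counts.items() if n >= min_tokens}
        selected[cluster] = kept if kept else dict(counts)
    return selected


if __name__ == "__main__":
    # Invented counts for illustration; "Bezriks" stands for a typo-like type.
    toy_counts = {
        "rks": {"Bezirks": 120, "Bezriks": 1},
        "skv": {"Squaw": 3},
    }
    print(select_types(toy_counts))
```

With the toy counts, the rare type under /rks/ is dropped, while /skv/ keeps its only type even though it falls below the threshold.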

Clearly, new automatic tools should be developed for reducing the error-prone nature of large electronic corpora. More efficient tools are also needed for pattern searches, as we ascertained when studying word-internal clusters.

Even with better tools, the evidence from such an electronic corpus of written adult and adult-directed speech must be considered with caution. The AMC represents just one genre, and it has been found, at least for Modern Greek and Balto-Slavic languages (Dressler et al. 2017), that the distribution of lexical and morphological patterns may differ significantly for different genres.

#### REFERENCES


### II. Morphonotactics in speech production

HANNAH LEYKUM<sup>1</sup>, SYLVIA MOOSMÜLLER<sup>†,1</sup>

#### 1. INTRODUCTION

The interaction between morphology and phonetics is an area for which a lot of research is still needed (see e.g. Kawahara 2011). Some findings favour the view that morphology does not influence speech production, while others indicate that an interaction between morphology and phonetics exists, i.e. there is an impact of morphology on the phonetic realization of speech.

One way to investigate this topic is to compare consonant combinations across word-internal morpheme boundaries (morphonotactic consonant clusters, e.g. /xt/ in German /mɒxt/ *macht* '(s/he) makes'), with consonant combinations within a single morpheme (phonotactic consonant clusters, e.g. /xt/ in German /mɒxt/ *Macht* 'power'). Some consonant combinations only exist across morpheme boundaries (purely morphonotactic clusters, e.g. /xst/ in German /mɒxst/ *machst* '(you) make'), whilst others exist nearly only within morphemes (predominantly phonotactic clusters, e.g. /mp͡f/ in German /ʃtʁʊmp͡f/ *Strumpf* 'sock'). There are, however, several consonant combinations which occur both within morphemes and across morpheme boundaries ((mor)phonotactic clusters); these have been studied in the present paper. For purely morphonotactic and predominantly phonotactic clusters, the cluster itself can mark the presence or absence of a morpheme boundary. However, (mor)phonotactic clusters have no boundary-signalling function. Hence, the question arises of whether morpheme boundaries within consonant clusters are marked phonetically. In order to investigate this question, the present study analyses (mor)phonotactic consonant clusters in homophonous word pairs, in word pairs of the same grammatical category, in different positions within the target words (word-final and word-medial clusters) and in languages/varieties with different typological classifications (word language, mixed-type language and quantifying language).<sup>2</sup>

 <sup>1</sup> Acoustics Research Institute, Austrian Academy of Sciences, Vienna.

 <sup>2</sup> Subsets of the material analysed in this paper have already been analysed for conference contributions and proceedings (Leykum, Moosmüller & Dressler 2015a; Leykum, Moosmüller & Dressler 2015b; Leykum & Moosmüller 2015; Leykum & Moosmüller 2016; Leykum & Moosmüller 2017; Leykum & Moosmüller 2018; Leykum & Moosmüller 2019). References concerning the corresponding papers or abstracts will be given at relevant points. However, in the present paper, a broad and detailed analysis of phonotactic and morphonotactic consonant clusters in speech production is conducted, going far beyond a summary of previous studies on subsets of the speech material.

#### 2. STATE OF RESEARCH

#### 2.1. INFLUENCE OF MORPHOLOGY ON SPEECH PRODUCTION

The few studies reporting an impact of morpheme boundaries on the phonetic realization of spoken language show diverging results. Some studies indicate an impact by a morpheme boundary on speech production: several studies (Neu 1980; Guy 1991; Guy 1996; Guy, Hay & Walker 2008; Myers 1995) on word-final /t, d/-deletion in American English (AE) and New Zealand English revealed that there are fewer coronal stop deletions when /d/ represents the regular past ending of conjugated verbs. Equally, for Standard Dutch, Schuppler et al. (2012) found fewer deletions of word-final /t/ when it constitutes a morpheme. Concerning word-final /s, z/ in AE, Seyfarth (2016) found longer durations for the stem and suffix of inflected verbs compared to the equivalent durational measurements for uninflected homophonous words (Pluymaekers et al. 2010). The above-mentioned findings, namely fewer reductions and fewer deletions across morpheme boundaries, can be explained by the importance of highlighting the morpheme boundary in order to enhance comprehensibility. Other studies, however, reported an influence of morphology where the direction of the effect is opposed to the aforementioned findings: Plag (2014) reported shorter durations of word-final /s/ following a morpheme boundary for Dutch. Pluymaekers et al. (2010) found an influence of morphology on the phonetic realization of the Dutch suffix *-igheid* (/əxhɛit/): the cluster /xh/ is realized with a longer duration when it consists of only one morpheme; it is realized with a shorter duration when the suffix is bimorphemic (the authors explain this result by the Morphological Informativeness Hypothesis).

Contrary to these findings, other studies revealed no effect of morpheme boundaries on consonant realizations: Zimmerer, Scharinger and Reetz (2011, 2014) showed a large influence of the phonological context on the realization of word-final /t/ in German, but no influence of the morphological status of /t/. Equally, a study investigating realizations and deletions of word-final /t, d/ in British English (BE) found a high impact of the surrounding phonemes on the realizations or deletions of /t, d/, but no influence due to morphology (Tagliamonte & Temple 2005). Seyfarth (2016) investigated AE homophones and found, for stimuli ending in [t, d], no influence of a morpheme boundary prior to the final stop on stem duration or suffix duration.

Some studies investigated articulatory processes during the realization of speech segments across morpheme boundaries. Cho (2001) investigated intergestural timing across morpheme boundaries in Korean by means of electromagnetic articulography (EMA) and electropalatography (EPG). He showed that articulation is more stable in monomorphemes and more variable across word-internal morpheme boundaries (in non-lexicalized compounds) as well as across word boundaries. However, by using combined acoustic-articulatory investigation methods (EMA, EPG, laryngography), Nakamura (2015) detected only an influence of the phonological context, but no impact of morphology on the realization or deletion of word-final coronal stops in British English.

#### 2.2. AIM OF THE STUDY AND HYPOTHESIS

Until now, acoustic investigations concerning the influence of morpheme boundaries on consonant realizations have been limited to durational measurements in a few languages. The present study not only investigates two languages in which the phonotactic-morphonotactic distinction of consonant clusters has not yet been investigated (apart from our own studies) but also adds the investigation of intensity measurements to the analyses of durational measurements. In addition, contrary to most of the aforementioned studies, which analysed single consonants following morpheme boundaries, our study focuses on phonologically homophonous (mor)phonotactic consonant clusters.

Apart from speech production, other research areas have studied phonotactic and morphonotactic consonant clusters. The processing of morphonotactic clusters is assumed to be facilitated by the morphological function of the consonant clusters (Korecky-Kröll et al. 2014; Celata et al. 2015). In computer simulations, different cognitive representations for the two types of clusters have been revealed (Calderone et al. 2014). Concerning first language acquisition, the findings are mixed. Some studies found that children learn to produce morphonotactic consonant clusters earlier compared to phonotactic consonant clusters (Kamandulytė 2006; Zydorowicz 2007), while others concluded that children learn both types of clusters at the same time (Freiberger 2007). The aforementioned investigations point out that in speech processing, computer simulations, and language acquisition, differences between the two types of clusters could exist. Therefore, as an extension of the Strong Morphonotactic Hypothesis (Dressler & Dziubalska-Kołaczyk 2006), which is restricted to an interaction between morphology and phonology (not phonetics), the hypothesis of the present study predicts that these differences also exist in speech production, even though the rare findings on speech production are mixed. The hypothesis is as follows:

Consonant clusters across word-internal morpheme boundaries (morphonotactic clusters) are expected to be more robust and more highlighted in speech production than consonant clusters within a morpheme (phonotactic clusters).

Since language-specific differences are possible, three different language types are compared in the present study: a word language (Standard German German (SGG)), a mixed-type language (Standard Austrian German (SAG)) and a quantifying language (Standard French (FR)). These three types were chosen to investigate whether language-type-specific timing characteristics have an influence on the highlighting/reduction of consonant clusters. In quantifying languages, a distinction between homophonous phonotactic and morphonotactic clusters may disturb the temporal pattern of the language (Moosmüller & Brandstätter 2014). Thus, reductions of phonotactic clusters and/or lengthening of phonemes in morphonotactic clusters are expected to be less probable in quantifying languages. Therefore, with regard to the language type, it is hypothesized that durational differences between phonotactic and morphonotactic clusters will be more pronounced in SGG as compared to SAG, and the differences are expected to be greater for both varieties of German than those in FR.

#### 2.3. MATERIAL AND GENERAL METHODS

#### *Stimuli*

Comparisons of the acoustic characteristics of consonant clusters within morphemes and across word-internal morpheme boundaries are only conclusive when the clusters are phonologically homophonous. Therefore, for the present study, several (mor)phonotactic consonant clusters were chosen which occur in the same position within words, once as a phonotactic cluster, and once as a morphonotactic cluster (emerging from productive word-formation rules). Since morphonotactic consonant clusters in the word-initial position are not possible in German, only consonant clusters in a word-medial and word-final position were considered for the present investigation.

The target words were nouns, verbs, and adjectives with a (mor)phonotactic consonant cluster in a word-final or word-medial position. Within each word pair, the phonemes preceding and (for word-medial clusters) following the consonant cluster were kept as constant as possible to minimize the influence of the phonological context on the realization of the consonant cluster. Therefore, the target words with word-final consonant clusters were pairs of homophonous words, which raises the problem that we have to compare nouns and conjugated verbs. For the target words with word-medial consonant clusters, word pairs belonging to the same grammatical category were chosen. The target words are listed in Table 1. Since the word pairs were not matched for word frequency, this variable was controlled for statistically. Word frequency values were extracted from http://wortschatz.uni-leipzig.de (Quasthoff, Goldhahn & Heyer 2013).

#### *Participants*

Recordings of 16 speakers of Standard Austrian German (SAG) were made. All these SAG speakers were, as defined by Moosmüller (1991), students (younger age group) or university graduates (younger and older age group) who were born and raised in Vienna, with at least one parent fulfilling the same criteria. The speakers can be assigned to two equal age groups: the younger speakers were between 18 and 25 years old; the older speakers were 45–60 years old. In both age groups, the speakers were balanced for gender.

Additionally, recordings of six younger speakers (18–25 years) of Standard French (FR) and eight speakers (18–25 years) of Standard German German (SGG) were conducted. In both groups, the speakers were balanced for gender. The speakers of FR were students or university graduates originating from the region Île-de-France; all speakers of SGG were born and raised in the northern part of Germany (north of the Benrath line). For all participants, the same criteria were fulfilled by at least one parent.

#### *Recordings*

The recordings were conducted in a semi-anechoic sound booth (IAC-1202A). In the recording session, after a semi-structured interview, the participants undertook several reading tasks. For one reading task, the target words were embedded in carrier phrases in a post-focal position. For this, the participants were told that they had to correct a misunderstanding concerning the addressee of an utterance. In the sentences, the pronoun or name was printed in bold, and the participants were asked to stress the pronoun/name when reading the sentences. This type of carrier phrase and the corresponding instructions were chosen to avoid stress on the target word, to enable phonetic reduction processes. The target word was always followed by the word *gesagt* 'said' to control the following phonological context for words with a word-final consonant cluster. The sentence finished with *glaube ich* 'I think' to avoid a sentence-final lengthening starting already in the target word. The following sentences are two examples of sentences for the first reading task:

*Zu ihr? - Ich habe zu ihm "die Hast" gesagt, glaube ich.* 'To **her**? - I said to **him** "the hurry", I think.'

*Zu mir? - Ich habe zu Peter "er macht" gesagt, glaube ich.* 'To **me**? - I said to **Peter** "he makes", I think.'

In a second speaking task, semi-spontaneous speech was elicited. In this task, the speakers had to read a given question (in which the target word was already mentioned) and answer the question by including two given words in their answer. The first given word was the target word, and the second word was given to draw attention away from the target word and to facilitate the task. Only SAG and FR speakers performed the semi-spontaneous task. Here are two examples of the semi-spontaneous speaking task:


Possible answers by the participants for the first question were: *In der Hast vergisst er seine Schlüssel* 'When he is in a hurry, he forgets his keys', and for the second example: *Nein, Herr Müller hasst Katzen* 'No, Mr. Müller hates cats'.

Additionally, some of the target words with word-final consonant clusters were embedded in more natural sentences. In this, for the target words, which were verbs, the subject pronoun and the verb were separated to reduce the redundant coding of the morpheme boundary. In addition, the target word was always followed by a word starting with /ɡ/, to reduce the impact of the phonological context. These sentences were only read by the speakers of SAG. Two examples of this second reading task are given below:

*Die Zeit misst gleich in der nächsten Runde Matthias.* 'The time will be measured in the next round by Matthias.'

*Ihr Freund hat gesagt, dass er sie nicht wirklich hasst, glaube ich.* 'Her friend said that he does not really hate her, I think.'

For both reading tasks, the sentences were put in random order and read by the participants twice within the larger recording session. The semi-spontaneous speaking task was conducted only once. After excluding a few mispronounced and misread items, this resulted in a total of 2,402 analysable target words (SAG, SGG and FR; word-medial and word-final).

In order to conduct the acoustic analyses, the recordings were manually segmented and annotated with STx (Noll et al. 2007) on a sentence, word and phoneme level. The duration and intensity values of the following segments were measured and semi-automatically extracted: target words, surrounding words, consonant clusters, individual consonants of the clusters, and phonemes surrounding the clusters.

The data was statistically analysed with R (R Core Team 2015) by using mixed-effects models (Bates et al. 2015). The variables subject and word were included in the models as random factors. Additionally, the following control variables were included in the models whenever they had an effect on the dependent variables: word frequency, articulation rate, /t/ deletions, stress on the target word, and pauses following the target word.
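The model structure described above can be written schematically as follows. This is only a sketch under assumptions: the symbols are introduced here for illustration and do not come from the original analysis, the fixed-effect vector stands for whichever predictors and control variables entered a given model, and normally distributed random intercepts for subject and word are assumed, which is a common default but is not stated explicitly in the text.

```latex
% y_{ijk}: dependent variable (e.g. relative cluster duration) for
%          repetition k of word j produced by speaker i
% x_{ijk}: fixed-effect predictors (type of cluster, task, control variables, ...)
% u_i, w_j: random intercepts for subject and word (assumed form)
y_{ijk} = \beta_0 + \mathbf{x}_{ijk}^{\top}\boldsymbol{\beta} + u_i + w_j + \varepsilon_{ijk},
\qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),\quad
w_j \sim \mathcal{N}(0, \sigma_w^2),\quad
\varepsilon_{ijk} \sim \mathcal{N}(0, \sigma^2)
```

Under this reading, an effect of the morpheme boundary would show up as a non-zero coefficient for the type-of-cluster predictor, or for an interaction involving it.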

The mixed-effects models were fitted using a forward approach: effects were added one by one. Based on the p-value, a decision was made on whether to keep the variable or interaction in the model or to exclude it (threshold: *p* = 0.1). Where necessary, Tukey post-hoc tests with p-value adjustment were carried out.
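The forward approach can be illustrated with a minimal Python sketch. It is not the procedure used in the study, which was carried out in R; the function names, the candidate terms and the p-values below are all hypothetical, and the sketch assumes that a term is kept when its p-value falls below the threshold of 0.1. The actual model fitting is abstracted into a caller-supplied function.

```python
from typing import Callable, List, Sequence


def forward_select(
    candidates: Sequence[str],
    p_value_of: Callable[[List[str], str], float],
    threshold: float = 0.1,
) -> List[str]:
    """Add candidate terms one by one; keep a term if its p-value is below threshold."""
    kept: List[str] = []
    for term in candidates:
        # p_value_of is assumed to refit the model with kept + [term]
        # and to return the p-value associated with `term`.
        p = p_value_of(kept, term)
        if p < threshold:
            kept.append(term)
    return kept


# Toy stand-in for a model fit: fixed, made-up p-values per term.
_toy_p = {"type_of_cluster": 0.45, "articulation_rate": 0.001, "task": 0.08}


def toy_p_value(kept: List[str], term: str) -> float:
    return _toy_p[term]


selected = forward_select(list(_toy_p), toy_p_value)
print(selected)
```

With these made-up values, only the terms whose p-value is below 0.1 remain in the model; in a real analysis the p-value of a term can change depending on which terms are already included.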

To normalize the data, two different methods were used: on the one hand, the total duration (or mean syllable duration for word-medial clusters) or intensity of the target word were included in the statistical analyses to control statistically for any impact of speaker-specific differences. On the other hand, the relative duration of each cluster or consonant was calculated by dividing the segment duration by the word duration, cluster duration or mean syllable duration (for word-medial clusters). To calculate the relative intensity, the intensity of the segment was divided by the mean word intensity or cluster intensity. The normalization method used for each analysis is indicated in the following section.
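The second normalization method amounts to simple ratios, which the following Python sketch spells out. The function names and the example values are hypothetical; the results are expressed in percent, matching the way relative values are reported in the results sections.

```python
def relative_duration(segment_ms: float, reference_ms: float) -> float:
    """Segment duration as a percentage of a reference duration.

    The reference can be the word duration, the cluster duration or the
    mean syllable duration, depending on the analysis.
    """
    if reference_ms <= 0:
        raise ValueError("reference duration must be positive")
    return 100.0 * segment_ms / reference_ms


def relative_intensity(segment_intensity: float, reference_intensity: float) -> float:
    """Segment intensity as a percentage of the mean word or cluster intensity."""
    if reference_intensity <= 0:
        raise ValueError("reference intensity must be positive")
    return 100.0 * segment_intensity / reference_intensity


# Made-up example: a 180 ms cluster in a 450 ms word.
cluster_share = relative_duration(180, 450)
print(f"cluster takes up {cluster_share:.1f}% of the word duration")
```

Dividing by a word-level reference removes a large part of the speaker- and tempo-specific variation, which is why the relative measures can be compared across speakers.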


Table 1. Target words (translations are provided in Table 2 and Table 4)

#### 3. ACOUSTIC ANALYSES

#### 3.1. WORD-FINAL CLUSTERS IN SAG AND SGG

First, word-final consonant clusters in homophonous word pairs realized by speakers of SAG (16 speakers) and SGG (8 young speakers) were compared (see also Leykum & Moosmüller 2015, Leykum et al. 2015a, Leykum & Moosmüller 2016). The target words were realized by all speakers twice within the carrier phrases. Moreover, the speakers of SAG conducted two additional tasks: they read sentences in which the subject pronoun and verb were separated for the bimorphemic target words (twice), and they realized the target word once in the semi-spontaneous speaking task.

The investigated target words were the following:



<sup>3</sup> Even though the orthography differs, for all word pairs, the item with a phonotactic cluster and its counterpart with a morphonotactic cluster are phonemically homophonous.

#### *Results*

#### /t/-deletion

In the word-final position, /t/ was acoustically deleted in several cases. In total, /t/-deletions occurred in 11.18% of the phonotactic clusters, whereas in morphonotactic clusters, 13.64% of word-final /t/ were acoustically deleted. The deletion rates did not significantly differ between the two types of clusters (*z* = -0.877, *p* = 0.381). Since the deletion rate is highly influenced by the phonological context (*z* = 3.777, *p* < 0.001), only the /t/s followed by /ɡ/ were regarded in the next step. Out of these clusters, 11.16% of the phonotactic clusters were realized without the /t/, and 10.38% of the morphonotactic clusters (here again, there is no significant difference between the two types of clusters: *z* = -0.220, *p* = 0.826).

Concerning the segmental context, the deletion rate of /t/ was highest when the preceding phoneme was the homorganic fricative /s/ as compared to the other preceding contexts (*z* = -4.139, *p* < 0.001; /t/-deletions following /s/: 16.09% in phonotactic clusters, 16.23% in morphonotactic clusters; /t/-deletions following other phonemes: 3.36% in phonotactic clusters, 7.66% in morphonotactic clusters, see Figure 1).
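How such deletion rates are tallied per type of cluster and preceding context can be sketched in Python as follows. The token records are made up for illustration and are not the study's data; the data layout and the function name are assumptions, and the sketch only computes descriptive percentages, not the significance tests reported above.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (cluster_type, preceding_phoneme, t_deleted); hypothetical record layout
Token = Tuple[str, str, bool]


def deletion_rates(tokens: Iterable[Token]) -> Dict[Tuple[str, str], float]:
    """Percentage of tokens with deleted /t/ per (cluster type, context) cell."""
    totals: Dict[Tuple[str, str], int] = defaultdict(int)
    deleted: Dict[Tuple[str, str], int] = defaultdict(int)
    for cluster_type, preceding, t_deleted in tokens:
        context = "after /s/" if preceding == "s" else "after other"
        key = (cluster_type, context)
        totals[key] += 1
        deleted[key] += int(t_deleted)
    return {key: 100.0 * deleted[key] / totals[key] for key in totals}


# Made-up tokens for illustration only.
toy_tokens = [
    ("phonotactic", "s", True),
    ("phonotactic", "s", False),
    ("phonotactic", "x", False),
    ("morphonotactic", "s", False),
    ("morphonotactic", "x", True),
    ("morphonotactic", "x", False),
    ("morphonotactic", "x", False),
    ("morphonotactic", "x", False),
]
rates = deletion_rates(toy_tokens)
for cell, rate in sorted(rates.items()):
    print(cell, f"{rate:.1f}%")
```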

Figure 1. Percentages of /t/-realizations and /t/-deletions

#### Relative duration of the cluster

The fitted mixed-effects models revealed the following significant effects for the relative duration of the entire cluster (in % of word duration): a type-of-cluster\*speaking-task interaction (*F*(2,1383) = 20.800, *p* < 0.001), a type-of-cluster\*/t/-realization interaction (*t*(1398) = 3.210, *p* = 0.001), a gender\*variety/age interaction (*F*(2,18) = 3.940, *p* = 0.037), a main effect of articulation rate (*t*(1395) = -10.670, *p* < 0.001), and a main effect of the cluster (*F*(6,7) = 47.035, *p* < 0.001). Post-hoc analyses showed a significant type-of-cluster difference for the type-of-cluster\*speaking-task interaction in the additional speaking task only, where subject pronoun and conjugated verb were separated. Here, the phonotactic clusters were shorter compared to the morphonotactic clusters (*t*(24) = 3.629, *p* = 0.015). The type-of-cluster\*/t/-realization interaction revealed shorter durations for both types of clusters when the word-final /t/ was deleted. This effect was slightly larger for morphonotactic clusters (phonotactic: *t*(1408) = 6.723, *p* < 0.001; morphonotactic: *t*(1411) = 11.539, *p* < 0.001, see Figure 2). A closer look at the gender\*variety/age interaction revealed that the clusters of the elder female SAG speakers were shorter compared to all other groups of speakers (see Table 3).


Table 3. Gender\*variety/age-interaction (post-hoc tests)

Figure 2. Interaction type-of-cluster\*/t/-realization

#### Relative duration of /t/

Concerning the duration of /t/ in relation to the duration of the entire word (% of word duration), the statistical analyses revealed a significant three-way interaction between the task, the word frequency and the type of cluster (*F*(2,1194) = 5.291, *p* = 0.005), and main effects of articulation rate (*t*(1114) = 2.641, *p* = 0.008), variety/age (tendency: *F*(2,20) = 2.909, *p* = 0.077; elder SAG < younger SAG < SGG speakers), gender (tendency: *t*(19) = 1.994, *p* = 0.061; female < male speakers), and cluster (*F*(6,7) = 17.111, *p* < 0.001). Post-hoc analyses showed that only for the speaking task with separated pronoun and verb was it the case that the higher the word frequency, the more the two types of clusters differed in their length, with /t/ being relatively longer in morphonotactic clusters (see Figure 3).

Figure 3. Three-way interaction: task\*word-frequency\*type-of-cluster

The effect of the articulation rate (longer relative duration of /t/ for higher articulation rates) emerged due to an articulation-rate-induced shortening of the entire word (especially the vowel: main effect of articulation rate (*t*(1230) = -8.621, *p* < 0.001)).

#### Relative intensity of the clusters

The fitted mixed-effects model showed that the relative intensity of the clusters (in % of word intensity) is significantly influenced by an interaction between the task and /t/-realization (*F*(2,1406) = 4.114, *p* = 0.016), a word-frequency\*articulation-rate interaction (*t*(1397) = -2.780, *p* = 0.005), a main effect of gender (*t*(22) = 3.562, *p* = 0.002), and a main effect of cluster (*F*(6,9) = 130, *p* < 0.001). No influence of the type of cluster was found (*p* = 0.804). A post-hoc test concerning the task\*/t/-realization interaction revealed significantly lower relative cluster intensities of clusters with realized final /t/ compared to the clusters with /t/-deletion for both reading tasks (carrier phrases: *t*(1409) = -3.215, *p* = 0.017; second reading task: *t*(1401) = -3.659, *p* = 0.004), but not for the semi-spontaneous speaking task (*t*(1402) = -0.237, *p* = 0.999).

#### Relative intensity of /t/

The relative intensity of /t/ (in % of word intensity) is influenced by an interaction between the type of cluster and the task (*F*(2,1122) = 6.657, *p* = 0.001), an interaction of task and gender (*F*(2,1204) = 9.462, *p* < 0.001), a main effect of articulation rate (*t*(1213) = 2.989, *p* = 0.003), and a main effect of cluster (*F*(6,12) = 13.959, *p* < 0.001). Post-hoc analyses revealed that in none of the speaking tasks did phonotactic and morphonotactic clusters differ in their relative intensity of /t/. However, the relative intensity of /t/ was significantly lower in the speaking task with a separated subject pronoun and verb as compared to /t/ in target words embedded in the carrier phrases. This effect was slightly larger for the phonotactic clusters (morphonotactic clusters: *t*(1204) = 4.058, *p* < 0.001; phonotactic clusters: *t*(1147) = 7.607, *p* < 0.001, see Figure 4).

Figure 4. Interaction type-of-cluster\*task

#### *Discussion of word-final consonant clusters*

Regarding the number of acoustic deletions of /t/ and the relative intensity of the cluster, no significant difference between phonotactic and morphonotactic clusters exists. For the other investigated variables, interactions including an effect of the type of cluster reached significance. However, no main effects of the type of cluster were found. The interactions were, with one exception, all interactions with the speaking task. The additional speaking task was designed to test whether the redundant coding of the information given by the conjugational morpheme reduces the importance of a highlighting of the morpheme boundary, which could result in less highlighting of morphonotactic clusters. If so, opposing effects could explain the lack of a difference between the phonetic realization of phonotactic and morphonotactic clusters in the other speaking tasks. However, the effects found in the present study cannot be interpreted as evidence for this hypothesis, since the additional speaking task involved a highly unnatural wording for some of the sentences, which in itself results in a higher articulation accuracy. The target words containing phonotactic clusters were also embedded in the sentences. For the nouns, however, the context was more natural, possibly resulting in a less accurate articulation. In addition, some of the target words with a morphonotactic cluster were in a phrase-final position, resulting in phrase-final lengthening of the target word.

With regard to the relative duration of the cluster, the type-of-cluster\*/t/-realization interaction showing a slightly larger difference between clusters with and without final /t/ for morphonotactic clusters compared to phonotactic clusters seems to be a random result, which possibly emerged due to differences between the clusters themselves and the low number of clusters with /t/-deletion.

The effects of the type of cluster emerging in the analyses can easily be explained by the unnatural wording, and by differences in the positions of the target words within the sentences in the second reading task. However, when investigating word-final consonant clusters in German homophones, the lack of an effect of a cluster-internal morpheme boundary on speech production cannot be interpreted as evidence for the nonexistence of an influence of the morpheme boundary on the realization of morphonotactic consonant clusters. Within each word pair, the stimuli not only differed in being monomorphemic or bimorphemic, but also in the grammatical category to which the target words belong.

#### 3.2. WORD-MEDIAL CLUSTERS IN SAG, SGG AND FR

In a further step, the (mor)phonotactic clusters in word-medial position were investigated (see Leykum & Moosmüller (2019) for word-medial clusters in SAG; Leykum & Moosmüller (2017) for a comparison of the three languages/varieties). Here, in most cases, the grammatical category was identical for both stimuli within each word pair. The target words are listed in Table 4.


Table 4. Target words with word-medial consonant cluster (p = phonotactic, m = morphonotactic; the word pairs not matched for grammatical category are shaded)


The target words were realized twice by all 30 speakers (16 SAG, 8 SGG, 6 FR) within the carrier phrases. In addition, the speakers of SAG and FR conducted the semi-spontaneous speaking task.

#### *Results*

#### Absolute cluster duration

The statistical analyses (with relative syllable duration as the control variable) showed a significant interaction between language/variety and articulation rate (*F*(3,753) = 9.863, *p* < 0.001), an interaction between articulation rate and cluster (*F*(6,918) = 11.944, *p* < 0.001), and a main effect of the speaking task (*t*(912) = -3.018, *p* = 0.003; shorter clusters in the semi-spontaneous speaking task). Concerning the language/variety\*articulation-rate interaction, a decrease in the cluster duration with increasing articulation rate was slightly steeper for the speakers of SGG compared to the other groups of speakers. The articulation-rate\*cluster interaction emerged because the duration of the clusters /ŋkt/ and /ksp/ was more affected by the articulation rate than the other clusters. The duration of the cluster /sk/ was least influenced by the articulation rate. A morpheme boundary within the clusters had no influence on the duration of the clusters (*p* = 0.864).

Since the material is not well balanced, another mixed-effects model was fitted for a subset of the data. Here, only the stimuli embedded in the carrier phrases were analysed. Furthermore, the French items and the word pairs *Paste-passte* 'paste-fitted', *Küste-küsste* 'coast-kissed' and *Diskothek-diskontinuierlich* 'discotheque-discontinuous' were excluded so that only word pairs matched for their grammatical category were used, to enhance the comparability. For this subset of data, a three-way interaction between gender, type of cluster and grammatical category emerged (*F*(1,463) = 7.398, *p* = 0.007). Post-hoc analyses showed longer durations of phonotactic clusters for both genders for adjectives. Concerning nouns, no effect occurred for female speakers. For male speakers, however, the phonotactic clusters were shorter compared to the clusters of female speakers and compared to male speakers producing morphonotactic clusters (see Figure 5). An articulation-rate\*cluster interaction (*F*(4,470) = 5.970, *p* < 0.001) revealed the same effects as the analyses of the entire dataset (see above). In addition, an effect of variety/age (*F*(2,12) = 21.320, *p* < 0.001) revealed that speakers of SGG produced the clusters with longer durations compared to both age groups of speakers of SAG.

Figure 5. Interaction gender\*type-of-cluster\*grammatical category

#### Relative cluster duration (in % of mean syllable duration)

When normalizing the cluster duration by using the mean syllable duration, the fitted mixed-effects model showed the following significant effects: a task\*articulation-rate interaction (*F*(1,878) = 9.315, *p* = 0.002), a main effect of language/variety (*F*(3,24) = 7.962, *p* < 0.001), and a main effect of cluster (*F*(6,21) = 3.210, *p* = 0.021). A morpheme boundary within the cluster had no effect on the relative duration of word-medial clusters (*p* = 0.461).

#### Duration of the cluster-final consonant relative to the cluster duration

When dividing the clusters at the position of the morpheme boundary of the morphonotactic clusters (/xt/ → /x|t/, /ŋkt/ → /ŋk|t/, /sm/ → /s|m/, /sl/ → /s|l/, /sk/ → /s|k/, /ksp/ → /ks|p/) and dividing the duration of the second part of the cluster by the total cluster duration, the relative duration of the cluster-final consonant is calculated. The statistical analyses revealed that the relative duration of the cluster-final consonant is influenced by an interaction between articulation rate, gender and task (*F*(1,868) = 4.014, *p* = 0.045): for target words in carrier phrases, the relative duration of the cluster-final consonant is influenced by the articulation rate only for female speakers. In the semi-spontaneous speaking task, the articulation rate does not influence the duration of the cluster-final consonant. In addition, a main effect of cluster (*F*(6,20) = 7.800, *p* < 0.001) occurred. The type of cluster had no influence on the relative duration of the cluster-final consonant (*p* = 0.307). When reducing the data to a subset of the stimuli which were balanced in terms of the grammatical category, a tendency for an effect of the grammatical category (*F*(2,19) = 4.723, *p* = 0.059) showed longer durations for the cluster-final consonant in adjectives compared to nouns. However, this is not a global effect of differences between nouns and adjectives, but more likely an effect arising due to differences between the different word pairs.
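The measure defined at the beginning of this subsection can be sketched in Python on the basis of annotated segment boundaries. The segment times, the data layout and the function name are hypothetical; the sketch assumes that each consonant of the cluster is available as a labelled interval and that the position of the (potential) morpheme boundary is given as an index into the list of segments.

```python
from typing import List, Tuple

# (label, start_s, end_s); hypothetical annotation layout
Segment = Tuple[str, float, float]


def final_part_share(cluster: List[Segment], boundary_index: int) -> float:
    """Share (%) of the cluster duration taken up by the segments from
    boundary_index onwards, i.e. the part after the morpheme boundary."""
    if not 0 < boundary_index < len(cluster):
        raise ValueError("boundary must fall inside the cluster")
    cluster_start = cluster[0][1]
    cluster_end = cluster[-1][2]
    second_part_start = cluster[boundary_index][1]
    return 100.0 * (cluster_end - second_part_start) / (cluster_end - cluster_start)


# Made-up segment times for a cluster split before its final consonant.
toy_cluster = [("ŋ", 0.00, 0.06), ("k", 0.06, 0.11), ("t", 0.11, 0.20)]
share = final_part_share(toy_cluster, boundary_index=2)
print(f"cluster-final consonant: {share:.1f}% of the cluster duration")
```

The phonotactic member of each pair is split at the same position as its morphonotactic counterpart, so that the two values are directly comparable.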

#### Absolute intensity of the cluster

The absolute intensity of the investigated word-medial consonant clusters is influenced by a main effect of gender (*t*(22) = 2.929, *p* = 0.008), with higher intensity of the clusters realized by male speakers. In addition, a main effect of cluster (*F*(6,22) = 20.900, *p* < 0.001) occurred, and a main effect of speaking task (*t*(925) = 3.890, *p* < 0.001), with higher intensities in the semi-spontaneous speaking task. The type of cluster had no significant influence on the absolute intensity of the clusters (*p* = 0.125).

#### Relative intensity of the cluster (relative to the intensity of the vowel preceding the cluster)

When normalizing the cluster intensity by calculating the intensity in relation to the intensity of the vowel preceding the cluster, besides an effect of cluster (*F*(6,19) = 9.128, *p* < 0.001), a significant three-way interaction between word-frequency\*articulation-rate\*language/variety (*F*(3,920) = 5.794, *p* < 0.001) occurred. In this, for the speakers of SGG, the relative intensity of the clusters decreased with increasing word frequency for higher articulation rates. The other groups of speakers did not show such an effect. Here again, the type of cluster did not influence the intensity values of the clusters (*p* = 0.133).

#### Intensity of the cluster-final consonant relative to the cluster intensity

The relative intensity of the cluster-final consonant was calculated by dividing the intensity of the consonant by the intensity of the cluster. The fitted mixed-effects model resulted in a main effect of language/variety (*F*(3,24) = 5.039, *p* = 0.008), a main effect of cluster (*F*(6,21) = 87.051, *p* < 0.001), and a main effect of task (*t*(938) = 3.727, *p* < 0.001). Post-hoc analyses showed that the relative intensity of the cluster-final consonant is lower in FR compared to SAG and SGG. In addition, it is higher in the semi-spontaneous speaking task. The effect of language/variety emerged due to language- and item-specific word-stress differences. The type of cluster had no influence on the relative intensity of the cluster-final consonant (*p* = 0.118).

#### *Discussion of word-medial consonant clusters*

One advantage of investigating word-medial clusters is the possibility to compare word pairs of the same grammatical category, as well as to compare French and German consonant clusters, as some consonant clusters exist in both languages, both within morphemes and across morpheme boundaries in a word-medial position. However, there are also several disadvantages: it is not possible to control the phonological context as much as for clusters in homophones; the target words are more diverse, not only in terms of the phonemes surrounding the consonant clusters, but also in terms of the exact position of the consonant clusters within the words, and, most importantly, in terms of the position of the word stress within the German target words.

In order to investigate the impact of the grammatical category of the target words on the realization of the consonant clusters, the dataset was restricted to a subset including only word pairs where both items within each pair belong to an identical grammatical category. These analyses revealed an effect of the grammatical category in only two of the fitted models: concerning the absolute cluster duration and concerning the relative duration of the cluster-final consonant. With regard to the absolute duration of the cluster, the grammatical category was part of a three-way interaction, revealing a longer duration of phonotactic clusters for adjectives for all speakers and a shorter duration of the phonotactic clusters for nouns when realized by male speakers. Furthermore, the relative duration of the cluster-final consonant was longer in adjectives compared to nouns. Since there were no interactions between the grammatical category and the type of cluster besides the three-way interaction, it can be concluded that, for adjectives and nouns, the grammatical category can be ruled out as a factor which could mask effects of a morpheme boundary on the realization of word-medial consonant clusters. In addition, concerning the three-way interaction affecting the absolute cluster duration, the effect on the adjectives is in the opposite direction to the hypothesis: in adjectives, phonotactic clusters were longer than morphonotactic clusters. Concerning nouns, the phonotactic clusters of male speakers were shorter compared to the other clusters, which could be interpreted as less reduction of the morphonotactic clusters by male speakers, compared to a low level of reductions by female speakers, irrespective of the presence or absence of a morpheme boundary.

Concerning all other investigated variables, no effect of a consonant cluster internal morpheme boundary on the realization of the word-medial clusters was detectable.

#### 4. GENERAL DISCUSSION

Previous studies investigating the influence of morpheme boundaries on speech production came to different results. Some studies revealed an effect indicating an acoustic highlighting of the morpheme boundary by lengthening phonemes across morpheme boundaries. Other studies found no effect of a morpheme boundary, or even indicated results with an effect in the opposite direction. Likewise, in the present study, for some variables an effect of the type of cluster emerged, either in the expected direction or in the opposite direction. However, most analyses did not find any effect of the morpheme boundary.

Since the observed effects of the morpheme boundary can all be easily explained by other interfering variables, the present study does not provide any evidence for an impact of morphology on speech production. However, the absence of any effects does not necessarily imply that no influence of morpheme boundaries on the realization of consonant clusters exists.

The possibility that the highly redundant coding of the information of the morpheme boundary in conjugated verbs with word-final morphonotactic consonant clusters leads to a less accurate articulation cannot be ruled out. Opposite effects caused by the morpheme boundary and the high redundancy are still possible. Due to the unnatural wording and the lack of a possibility to match the target words in terms of the position within the sentences, the additional reading task did not provide conclusive findings.

Another factor linked to the redundancy, described by Hanique and Ernestus (2012: 175), is the word-information load: "The less a segment contributes to distinguishing the complete word from other words, the more it may be reduced". Equally, the degree of morphological decomposability could constitute a factor influencing whether morphonotactic consonant clusters are treated differently from phonotactic clusters in speech production.

The present findings were able to rule out language- and/or variety-specific timing characteristics as a factor inhibiting an acoustic differentiation between phonotactic and morphonotactic clusters (see also Leykum & Moosmüller 2017). However, besides language-specific timing characteristics, other language-specific differences could exist. The investigated languages share a low morphological richness, raising the question of whether the morphological richness of a language determines whether phonotactic and morphonotactic clusters behave the same or not. It is possible that in morphologically richer languages, the information about the morpheme boundary is more important to ensure intelligibility. This hypothesis is supported by research on first language acquisition. It was shown that children acquiring Austrian German learn both types of clusters at the same time (Freiberger 2007), whereas, in the first language acquisition of the morphologically richer languages Polish and Lithuanian, children learn to produce morphonotactic consonant clusters correctly prior to phonotactic consonant clusters (Kamandulytė 2006; Zydorowicz 2007).

#### 5. CONCLUSION

Combining the present findings with analyses of the subsegmental parts of /t/ in word-final clusters (Leykum & Moosmüller 2015; Leykum et al. 2015b; Leykum & Moosmüller 2018), conducted on the same material, none of the analyses could prove that morphonotactic consonant clusters are more highlighted or less susceptible to reduction processes.

Yet, the present analyses do not prove that phonotactic and morphonotactic consonant clusters are identical in their phonetic realization, since statistically insignificant results do not imply that no effect exists. However, the fact that quite a lot of analyses were conducted on a relatively large dataset, all showing no or no stable effect of the morpheme boundary on speech production, leads us to the conclusion that it is very unlikely that speakers realize morphonotactic consonant clusters in German differently because of the morpheme boundary.

#### ACKNOWLEDGEMENTS

The current investigation was undertaken within the project I 1394-G23 'Human Behaviour and Machine Simulation in the Processing of (Mor)Phonotactics', funded by the FWF Austrian Science Fund, and the project 'Die österreichische Standardaussprache Wiens in Kontakt mit der deutschen Standardaussprache', funded by Kultur Wien.

#### REFERENCES


*English*. International Workshop on Language Variation and Linguistic, Nijmegen.


### III. The acquisition and processing of (mor)phonotactic consonant clusters in German

SABINE SOMMER-LOLEI<sup>1,2</sup>, KATHARINA KORECKY-KRÖLL<sup>3</sup>, MARKUS CHRISTINER<sup>2,4</sup>, WOLFGANG U. DRESSLER<sup>1</sup>

#### 1. INTRODUCTION

The aims of this psycholinguistic contribution are to show 1) how morphonotactic and phonotactic German consonant clusters differ in early spontaneous first language acquisition and 2) (more importantly) how they differ in processing experiments; under which conditions one of the two cluster types is acquired earlier and processed more accurately and with shorter latency; and what impact frequency, familiarity and foreignness have on the processing of simple words, compounds and morphological derivatives.

Phonotactics and morphotactics interact in the area of morphonotactics (Dressler & Dziubalska-Kołaczyk 2006). As will be shown in the following sections of this chapter, consonant clusters with and without morpheme boundaries are a good testing ground for the investigation of morphonotactics. Whereas phonotactic consonant clusters are found in simple word stems (e.g. German *Wicht* 'wight'), morphonotactic clusters cross morpheme boundaries in inflected, derived or compound words (e.g. German *(er/sie/es) mach-t* '(s/he/it) makes', *Reich-tum* 'richness', *Pech+tag* 'off-day'). Sometimes morphonotactic clusters are entirely new consonant clusters that can only be found in morphologically complex words (e.g. German *ruf-st* '(you) call'), but sometimes they may also be homophonous with existing phonotactic clusters (e.g. German morphonotactic *lob-st* '(you) praise' vs. phonotactic *Obst* 'fruit'). The question that arises is whether the interaction between phonotactics and morphonotactics facilitates the processing and acquisition of morphonotactic clusters or makes it more difficult, or if it makes no difference whether a word contains a phonotactic or morphonotactic cluster. In contrast to Slavic languages, German has only very few examples of morphonotactic clusters which arise through vowel deletion, and all clusters involved are homophonous with phonotactic clusters, e.g. in the adjective *risk-ant* 'risky' derived from the noun *Risiko* 'risk'. For a corpus linguistic description of complex German consonant clusters, see Dressler and Kononenko in this volume.

 <sup>1</sup> Austrian Centre for Digital Humanities and Cultural Heritage (ACDH-CH) of the Austrian Academy of Sciences, Vienna & University of Vienna.

 <sup>2</sup> Recipient of a DOC-team fellowship of the Austrian Academy of Sciences.

 <sup>3</sup> Department of German Studies of the University of Vienna.

 <sup>4</sup> University of Vienna.

Previous research (Zydorowicz 2007, 2009, 2010; Kamandulytė 2006; Freiberger 2007, 2014; Korecky-Kröll & Dressler 2015; Zydorowicz et al. 2015) has shown no greater acquisitional difficulties for morphonotactic clusters compared to phonotactic clusters in typically developing children. Evidence for the facilitation of acquiring morphonotactic clusters has been found for Polish and Lithuanian, but not for German. Evidence for the ease of processing of German has been divergent (Korecky-Kröll et al. 2014; Celata et al. 2015; Freiberger et al. 2015).

The results of the above investigations have led to the hypothesis (Zydorowicz et al. 2015) that processing ease correlates with the morphological richness of the respective language. Morphological richness is defined as the amount of productive morphology (Dressler 1999, 2004), and it has been demonstrated by Xanthos et al. (2011) that its presence in child-directed speech supports children in developing and speeding up the acquisition of morphology. For processing, our corresponding claim refers to Libben's (2014) principle of maximum opportunity.

German morphology is relatively poor in inflection, which probably explains why we have found no facilitation of the acquisition and processing of morphonotactic clusters, but it is rich in compounding and several areas of derivational morphology.

We examined the influence of (mor)phonotactics in visual word recognition in lexical access and word identification (Korecky-Kröll et al. 2014; Freiberger et al. 2015; Korecky-Kröll et al. 2016; Sommer-Lolei, Korecky-Kröll & Dressler 2017) in order to test the Strong Morphonotactic Hypothesis (Dressler & Dziubalska-Kołaczyk 2006), which claims that morphonotactic consonant clusters facilitate processing and acquisition and lead to higher accuracy because of the significant morphological information those clusters carry. We will test this hypothesis for German, to see whether it will be supported because morphonotactics facilitates word recognition in lexical processing. Or will processing rather be impeded due to the higher processing cost of inflected word forms vs. base forms? Furthermore, we hypothesize, first, that the Strong Morphonotactic Hypothesis may not be equally true for all languages and may depend on the morphological richness of a language or its respective morphological subsystem; second, that factors other than the often-overestimated frequency may play a major role in processing.

#### 2. CONSONANT CLUSTERS IN FIRST LANGUAGE ACQUISITION

Freiberger (2014) investigated the early acquisition of morphonotactic and phonotactic consonant clusters in three typically developing monolingual toddlers (1 boy, 2 girls) from high SES backgrounds (HSES), who were acquiring Standard Austrian German. They were recorded longitudinally at their homes in Vienna, from 1 year 7 months (1;7) up to 3;0 years. She divided this period into three<sup>5</sup> developmental phases (phase 1: 1;7–2;0, phase 2: 2;1–2;6, phase 3: 2;7–3;0). All of these data were transcribed and coded by using an adapted German version of CHILDES (cf. MacWhinney 2000). Freiberger analysed 180 minutes per child and phase and investigated all correctly and incorrectly produced consonant clusters in the spontaneous speech of these mother-child interactions in word-initial, -medial and -final position.

The results show, as expected, that all children make significant progress from the first to the third phase, and that the children have more difficulty with word-initial clusters, which are all phonotactic in German, than with word-final clusters. This can be attributed to the fact that final elements are perceived best, well known as the *recency effect* (cf. Eysenck & Keane 2000). In this study no significant morpheme boundary effect was found, which reveals that there is no difference between morphonotactic and phonotactic consonant clusters in the early acquisition of German.

In another investigation by Korecky-Kröll et al. (2016) the spontaneous speech of parent-child interactions of 29 typically developing monolingual and Standard Austrian German-acquiring children from different SES backgrounds (HSES vs. lower SES (LSES)) (7 boys and 8 girls of HSES, 8 boys and 6 girls of LSES) of the INPUT project<sup>6</sup> was analysed. The children were video and audio recorded in everyday situations with their main caretakers in their homes in Vienna at four data points at the

 <sup>5</sup> Except for the boy, whose audio recordings already started at age 1;3 and therefore have an additional fourth phase (phase 0: 1;3–1;6), cf. Freiberger (2014: 7).

 <sup>6</sup> INPUT project: 'Investigating Parental and Other Caretakers' Utterances to Kindergarten Children' (SSH11-027) funded by the Wiener Wissenschafts-, Forschungs- und Technologiefonds (WWTF, Vienna Science and Technology Fund).

mean ages 3;1, 3;4, 4;4 and 4;8. The most interactive 30 minutes per recording were transcribed, coded and analysed, which is equivalent to 58 hours of spontaneous speech material. The correct use of Cst (consonant + st) clusters in medial and final position was examined in relation to the effects of the data point, SES and morpheme boundary.

The results demonstrated that the children made significant progress from the third to the fourth data point and also showed a significant effect of the socio-economic status, since LSES children showed lower accuracy. Similarly to Freiberger's investigation (see above), there was no significant influence from the morpheme boundary.

The overall results show that there are no differences in first language acquisition of German between morphonotactic and phonotactic consonant clusters, neither in very young (Freiberger 2014) nor in older children (Korecky-Kröll et al. 2016). This is similar to the findings of Kirk and Demuth (2005) for English, which is an even more weakly inflecting language than German, and unlike the results presented in studies of strongly inflecting, morphologically richer languages, such as Polish and Lithuanian (cf. Zydorowicz 2007, 2009, 2010; Kamandulytė 2006; Zydorowicz et al. 2015), in which the authors showed some differences in favour of morphonotactic clusters. This difference might be due to the fact that most of the investigated German morphonotactic clusters occurred in inflected word forms. Still, Freiberger (2007: 20) demonstrated that the presence of a morpheme boundary does not render word recognition and production in child speech (CS) more difficult.

There are two open issues which we would like to address: first, whether or not the preference for morphonotactic clusters in certain languages or their morphological subsystems depends on the degree of morphological richness, and second, whether the acquisition of distinct consonant patterns truly correlates with the presence of a morpheme boundary, or whether, instead, the morpheme boundary itself is the important factor, regardless of the existence of a consonant cluster.

#### 3. CONSONANT CLUSTERS IN PROCESSING

#### 3.1. PREVIOUS EXPERIMENTS

To test the Strong Morphonotactic Hypothesis for German in adult and adolescent processing, four experiments were performed, which led to divergent results. A study by Korecky-Kröll et al. (2014) used a visual sequence targeting experiment in which 84 native Standard Austrian German-speaking participants had to detect whether a given stimulus contained either one letter (T) or a sequence of two letters (ST, AN). The experiment was divided into four different tasks: find T (in words containing ST); find T (in a word containing only T, but not ST); find ST; and find AN. In the first two parts participants had to find T a) in a consonant+st (Cst) sequence (e.g. *Obst* 'fruit' vs. *lob-st* 'you praise') and b) in a consonant+t (Ct) combination (e.g. *Karton* 'cardboard' vs. *dankte* 'thanked', *Lift* 'elevator' vs. *pack-t* '(s/he/it) packs'). In the third task, participants had to detect ST in a word presented visually (e.g. *Stempel* 'stamp' vs. *brav-ste* (good-SUP) 'best' superlative) and in the fourth, they had to find AN (e.g. *Fasan* 'pheasant' vs. *vor-an* 'ahead'). Half of the stimuli did not contain the respective sequence, whereas the other half was divided into subgroups that contained the sequence in different positions. For example, in the find T (in T) experiment, 96 stimuli did not contain a /t/, whereas 16 stimuli contained a /t/ in the initial position, 16 in a medial position without morpheme boundary, 16 in a medial position with morpheme boundary, 16 in the final position without morpheme boundary, 16 in the final position with morpheme boundary as a default and 16 in the final position with obligatory morpheme boundary (after a diphthong or long vowel).

Besides other factors, the authors reported a significant facilitating impact from the morpheme boundary only in terms of reaction times (RT), but not of accuracy (ACC), which indicates processing on a sublexical level.

Celata et al. (2015) examined the processing of morphonotactic and phonotactic clusters in German in two different experiments. First, they performed a split cluster task, in which 38 adults (29 years and older) and 26 adolescents (11 to 15 years of age), all native speakers of Standard Austrian German, had to create novel diminutives and attenuative forms by inserting the vowel /i/ between two consonants (CC → CiC). They were presented with 14 monosyllabic test items ending in a Cst-cluster, half containing a morphonotactic cluster, half a phonotactic one, and 56 filler words with differing clusters. Thus, they had to transform, for example, a Cst-cluster /nst/ into /nist/ or /nsit/ in phonotactic *Dunst* 'mist' (→ \**Dunist* or \**Dunsit*) and in morphonotactic *kenn-st* 'you know' (→ \**kennist* or \**kennsit*). The result showed an overall preference for /ist/ over /sit/ responses and, only within the adult group, high accuracies regardless of the type of cluster. For the group of adolescents, the morphonotactic clusters were significantly easier to split than the phonotactic ones, whereas the adult group only showed a trend in favour of the stimuli containing a morpheme boundary. This demonstrates that the presence of a morpheme boundary tends to be helpful in connection with a word modifying task.
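The CC → CiC transformation in the split cluster task can be illustrated with a minimal sketch. It operates on orthographic strings ending in *st*, which is a simplifying assumption (the task itself concerns spoken forms), and the function name `split_cluster` is hypothetical, not taken from Celata et al. (2015).

```python
def split_cluster(word, vowel="i"):
    """Return the two possible CiC splits of a word-final Cst cluster.

    Assumption: the cluster is identified orthographically, i.e. the
    word ends in a consonant letter followed by "st".
    """
    if len(word) < 3 or not word.lower().endswith("st"):
        raise ValueError("expected a word ending in a Cst cluster")
    # /Cst/ -> /Cist/: vowel inserted between C and s
    ist_form = word[:-2] + vowel + word[-2:]
    # /Cst/ -> /Csit/: vowel inserted between s and t
    sit_form = word[:-1] + vowel + word[-1:]
    return ist_form, sit_form


# Phonotactic and morphonotactic examples from the text
for item in ("Dunst", "kennst"):
    print(item, "->", split_cluster(item))
```

Both response types are returned so that a participant's answer could be classified as an /ist/ or a /sit/ response.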

Second, a fragment monitoring task was conducted with 28 adolescents (aged 12 to 16) and 41 adults (aged 25 to 59), again all native speakers of Standard Austrian German. Participants were visually presented with a string in capital letters in the centre of the screen, while hearing words over headphones. They had to decide as quickly as possible whether the auditorily perceived word contained the string presented on the screen. In this experiment 30 German words, with a Cst-cluster in the mid- or final position, and 75 filler words, containing other clusters, served as stimuli (e.g. the Cst-cluster /nst/ in phonotactic *Kunst* 'art' vs. morphonotactic *kenn-st* 'you know').

The results showed that, overall, adults were significantly more accurate compared to the group of adolescents, which shows again that the acquisition of phonology and morphonology is not completely finished in adolescence. Contrary to our expectations, the presence of a morpheme boundary had no effect on adult accuracy, although it did have an effect on the latency of the younger group, with phonotactic patterns being detected faster than morphonotactic ones. Also, the adolescents made more errors in morphonotactic items. The authors point to the significant impact of frequency in this case, since high-frequency items were not only judged correctly more often, but also faster than low-frequency ones in the adult group, whereas this frequency difference did not show as much of an effect in adolescents. This points to different processing strategies in such a recognition task across ages.

Since the above experiments focused only on the sublexical level, as either morphemes, parts of morphemes or clusters that contained morphemes were processed, there was a strong need to investigate adult language processing on a higher level of language awareness, namely the lexical level.

#### 3.2. EXPERIMENTS ON LEXICAL ACCESS

In order to investigate the influence of (mor)phonotactics in lexical processing and to see whether morphonotactics facilitates or impedes visual word recognition, various experiments were performed within our research project.

The findings of all new experiments are first summarized and only afterwards described in detail. To investigate the processing of morphonotactic and phonotactic consonant clusters in whole word recognition, we conducted a progressive demasking task (PDT) and four different lexical decision tasks (LDT):


Previous studies on whole-word processing have demonstrated a higher processing cost for inflected word forms as opposed to monomorphemic words (e.g. Finnish, cf. Laine et al. 1999) or for inflected base forms (e.g. German, cf. Günther 1988). Therefore, the first experiments (LDT 1 and PDT) were created by Freiberger et al. (2015) to test whether native speakers of German show similar tendencies and whether they are sensitive to the presence of a morpheme boundary within a consonant cluster (e.g. /gt/, /bl/, /mt/) when the item is an inflectional form, or whether the morpheme boundary would not delay processing.

Both experiments (LDT 1 and PDT) contained inflected word forms which are non-citation forms. This suggested that it might be problematic to compare a citation form to a non-citation form. Therefore, Korecky-Kröll et al. (2016) performed another lexical decision task (LDT 2) in order to clarify this issue. Instead of comparing the processing of inflectional forms to monomorphemic words, compounds, which are citation forms like monomorphemic words, were used. Compounding is also a morphologically rich domain of German, richer than inflection, which is another issue that needed to be addressed. We hypothesized that compounds would be processed not only more accurately but also much faster than monomorphemic words.

In order to cover all categories of word formation, Sommer-Lolei et al. (2017) also conducted an experiment with derived words compared to monomorphemic words (LDT 3). Derivations, like compounds, are citation forms, but are expected to be harder to process, because compounds are morphosemantically more transparent than comparable derivations.

In our final lexical decision task (LDT 4) we combined the two previous experiments (LDT 2 and 3) into one, considering two newly introduced variables, namely familiarity and foreignness (see also 3.7).

#### 3.3. METHODOLOGY

To investigate lexical access, four lexical decision tasks were performed. The experiments (LDT 1, 2 and 3) were designed using the behavioural research software E-Prime 3.0 (Psychology Software Tools, Pittsburgh, PA) and were all carried out on the same Windows laptop (Freiberger et al. 2015; Korecky-Kröll et al. 2016; Sommer-Lolei et al. 2017). LDT 4 was an online experiment, programmed and provided at the URL https://quest.christiner.at/. Due to the high number of participants, who were all tested on one day, we tested LDT 4 on several Windows computers simultaneously. All of the test items were presented to the participants visually, capitalized in the centre of the screen, following a fixation cross. Participants had to decide as accurately and quickly as possible whether the presented string was an existing German word or not. The relevant keys were marked with a green sticker for an affirmative response, and a red sticker for a negative response. All reaction times measured in LDT 1, 2 and 3 that were below 300 ms or exceeded 2500 ms were excluded from the analysis. For LDT 4 reaction times were measured but considered unreliable due to the high number of different computers used at the same time and intermittent connectivity issues.

The progressive demasking task (PDT) was used to test word identification. The experiment was designed using the PDT software (Dufau, Stevens & Grainger 2008). The cut-off values in this experiment were < 300 ms and > 3500 ms. Participants had to identify a slowly demasking stimulus on their computer screen and to confirm identification of it as soon as possible by pressing a key. As a result, the word disappeared from the screen and they had to type the previously identified word as quickly as possible. The same Windows laptop was used to conduct the experiment as in LDT 1, 2 and 3.
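The reaction-time exclusion criteria can be sketched as a simple filter. This is only an illustration under stated assumptions: the `Trial` record, the `CUTOFFS_MS` table and the function name are hypothetical and do not reproduce the E-Prime or PDT software pipeline; only the cut-off values themselves come from the text.

```python
from dataclasses import dataclass

# Cut-off windows in milliseconds as reported above:
# LDT 1-3: below 300 ms or above 2500 ms excluded;
# PDT: below 300 ms or above 3500 ms excluded.
CUTOFFS_MS = {"LDT": (300, 2500), "PDT": (300, 3500)}


@dataclass
class Trial:
    participant: str  # hypothetical participant code
    item: str         # stimulus string as shown on screen
    rt_ms: float      # reaction time in milliseconds
    correct: bool     # whether the response was correct


def trim_reaction_times(trials, task):
    """Keep only trials whose reaction time lies within the task's window."""
    lower, upper = CUTOFFS_MS[task]
    return [t for t in trials if lower <= t.rt_ms <= upper]


# Invented example trials, for illustration only
example = [
    Trial("p01", "OBST", 250, True),
    Trial("p01", "LOBST", 300, True),
    Trial("p01", "LIFT", 2500, True),
    Trial("p01", "PACKT", 3000, False),
]
print(len(trim_reaction_times(example, "LDT")))
print(len(trim_reaction_times(example, "PDT")))
```

Values exactly at a boundary are kept here, since the text excludes only reaction times below the lower or above the upper cut-off.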

The choice of stimuli for LDT 4 was based on two ratings that revealed the degree of familiarity and foreignness of the test items previously used in LDT 2 and 3. In the first rating of compounds vs. monomorphemic words, participants got a list of 96 items and had to decide whether or not each word on the list was a foreign word, and they had to judge spontaneously how familiar they were with this word. While the foreignness rating was a decision task (Yes or No), the familiarity rating used a scale from 1 (well known) to 5 (unknown). This was conducted using an interactive PDF (Adobe Acrobat DC). Each rating was either forwarded immediately to the investigator via email or filled in by hand on a printed form and handed in by the participant. The second rating of derivations vs. monomorphemic words used an online questionnaire, programmed and provided by Markus Christiner on an online platform, URL: https://quest.christiner.at/. Similarly to the first rating, participants had to decide on the foreignness and familiarity of 96 stimulus words that were presented one after the other, whenever the participant clicked on the *Weiter* 'next/go on' button. It was not possible to measure reaction times.
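How such ratings might be condensed into per-item scores can be sketched as follows. The aggregation (share of Yes answers for foreignness, mean score for familiarity) is an assumption made for illustration; the text does not state how the ratings were summarized, and the function name and example responses are hypothetical.

```python
from statistics import mean


def summarize_ratings(ratings):
    """Aggregate ratings per item.

    ratings: iterable of (item, is_foreign, familiarity) tuples, where
    is_foreign is the Yes/No foreignness decision and familiarity is on
    the scale from 1 (well known) to 5 (unknown).
    """
    by_item = {}
    for item, is_foreign, familiarity in ratings:
        if not 1 <= familiarity <= 5:
            raise ValueError("familiarity must lie between 1 and 5")
        by_item.setdefault(item, []).append((is_foreign, familiarity))
    return {
        item: {
            # proportion of raters who judged the item to be foreign
            "foreign_share": mean(1.0 if foreign else 0.0 for foreign, _ in rows),
            # average familiarity score (lower = more familiar)
            "mean_familiarity": mean(fam for _, fam in rows),
        }
        for item, rows in by_item.items()
    }


# Invented responses from two raters, for illustration only
responses = [
    ("Taifun", True, 2),
    ("Taifun", True, 4),
    ("Haustier", False, 1),
    ("Haustier", False, 1),
]
print(summarize_ratings(responses))
```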

#### 3.4. MATERIALS

In each of the tasks the experimental items were 96 German words and 96 German-based non-words (one letter of a German word was changed in the monomorphemic words, two letters were changed in the compound and derivation-based items, i.e. one consonant or vowel in each of the two morphological parts), with the exception of LDT 4, in which the stimuli were 96 German words and 32 German-based non-words, divided into four conditions, as presented in Table 1:


Table 1. Conditions of the test stimuli

This results in 24 words and 24 non-words per condition. Half of the items contained a morpheme boundary, while the other half did not; the same holds for items with and without a consonant cluster. In LDT 1 and the PDT (Freiberger et al. 2015), stimulus words contained only biconsonantal clusters (e.g. /gr/, /br/), whereas in LDT 2 (Korecky-Kröll et al. 2016), LDT 3 (Sommer-Lolei et al. 2017) and LDT 4, bi- and triconsonantal clusters occurred (e.g. /rtn/, /lst/). The position of the consonant clusters was either in the final (LDT 1) or in a medial position (LDT 2, 3 and 4). For every morphonotactic consonant cluster there existed a phonotactic match (e.g. LDT 2 on compounds: (M+P+) *Haus+tier* 'domestic animal' vs. (M–P+) *Kastanie* 'chestnut'). Likewise, we matched conditions 2 and 4 (e.g. LDT 3 on derivations: (M+P–) *Zeig-er* 'pointer' vs. (M–P–) *Lager* 'storage'). Examples of stimuli for each experiment are listed in Table 2; for a summarizing overview of the tasks, see Tables 3 and 4. All of the items were balanced for word length, syllables and average word frequency (taken as the number of occurrences from the CELEX, the Austrian Media Corpus (AMC<sup>7</sup>) and also the *Leipzig Deutscher Wortschatz Online* databases).
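The 2 × 2 design (morpheme boundary M crossed with consonant cluster P) and the resulting item counts per condition can be made explicit in a short sketch. The function name and dictionary layout are hypothetical, and ASCII "-" stands in for the minus sign used in the condition labels; the counts follow from dividing 96 words and 96 non-words evenly over the four conditions.

```python
from itertools import product


def build_conditions(n_words=96, n_nonwords=96):
    """Cross morpheme boundary (M) with consonant cluster (P) and
    distribute the items evenly over the resulting four cells."""
    cells = list(product(("M+", "M-"), ("P+", "P-")))
    conditions = {}
    for number, (boundary, cluster) in enumerate(cells, start=1):
        conditions[number] = {
            "label": boundary + cluster,
            "words": n_words // len(cells),
            "nonwords": n_nonwords // len(cells),
        }
    return conditions


for number, cell in build_conditions().items():
    print(number, cell)
```

The numbering produced here (1 = M+P+, 2 = M+P-, 3 = M-P+, 4 = M-P-) matches the condition numbers used in the results for LDT 4.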


Table 2. Examples for German words used in the experiments per condition

#### 3.5. PARTICIPANTS

All the participants were adult monolingual native speakers of Standard Austrian German. None of them reported visual or neurological impairments or a history of language disorders.

In LDT 1, in which processing of inflectional forms was compared to monomorphemic word processing, 46 adults (aged 19 to 35) participated.

In the progressive demasking task, in which the same inflectional forms and monomorphemic words taken from the previously conducted LDT 1 had to be identified, 45 adults (aged 19 to 31) participated, likewise in LDT 2, in which the processing of monomorphemic words was compared to the processing of compounds, and in LDT 3, where the processing of derivations compared to monomorphemic words was investigated.

The first rating of foreignness and familiarity with regard to the list of compounds and monomorphemic words was conducted with 130 adult participants (aged 18 to 51); the second rating of the stimulus word list

 <sup>7</sup> The Austrian Media Corpus of the Austrian Academy of Sciences, based on APA (Austrian Press Agency) data, consists of over 10 billion word tokens. URL: https://www.oeaw.ac.at/acdh/tools/amc-austria-media-corpus/ [20.03.2019].

of derivations and monomorphemic words was performed by 102 adults (aged 20 to 59). Both ratings were used for stimulus word selection in LDT 4.

In LDT 4, as our most recently performed experiment, in which 84 adults (aged 18 to 36) participated, we united the previous two experiments LDT 2 and 3 in order to investigate the processing of compounds and derivations compared to monomorphemic words, in consideration of the familiarity and foreignness ratings of the stimulus words, where we selected in equal parts very familiar, familiar, very unfamiliar, very foreign, foreign and non-foreign stimulus words (compounds, derivations and monomorphemic nouns).

For a summary of all the performed tasks, ratings and participants, see Tables 3, 4 and 5:


Table 3. Overview of the visual word recognition tasks on inflection




Table 5. Overview of the familiarity and foreignness ratings

#### 3.6. RESULTS OF THE PROCESSING EXPERIMENTS

In the first whole-word recognition tasks, conducted by Freiberger et al. (2015), divergent results were found. Regarding the LDT 1 and the PDT on inflection, they found that in both experiments words with a morpheme boundary were significantly more difficult to process than all the other categories. In particular, the category M+P+, i.e. strings that contain a morpheme boundary and a consonant cluster, showed the highest latency, and in the lexical decision task also the lowest accuracy. This is in accordance with processing models which assume that affixed words are decomposed into base form and affix, which leads to higher processing costs. They did not find this accuracy result in the PDT; instead, words without a morpheme boundary and with a consonant cluster (M–P+) were processed more accurately. The longer reaction times for the M+P+ items can be explained by the fact that these strings were all inflectional forms. Therefore, Freiberger and colleagues concluded that it is problematic to compare a non-citation form with a citation form, as already mentioned above, but also that this may be due to the fact that German is only a weakly inflecting language, in which inflectional morphology is not important enough to facilitate lexical processing. In the strongly inflecting language Polish, Zydorowicz and Dziubalska-Kołaczyk (2017) found that the morpheme boundary helped processing. The results furthermore show that it is not the cluster that renders word recognition more difficult, but rather the morpheme boundary combined with the cluster. In this case the Strong Morphonotactic Hypothesis could not be confirmed for German, although there had been some evidence for it on the sublexical level in previous experiments (see 3.1).

In order to verify these findings and to test the hypothesis concerning the impact of citation forms, Korecky-Kröll et al. (2016) conducted the second experiment, LDT 2, on compounding. Since inflected word forms were included in the first experiments (Freiberger et al. 2015, see above), we wanted to see whether words that are citation forms, like monomorphemic words, and that derive from a morphologically rich domain of German, much richer than inflection, would lead the participants to a different behaviour. The results for accuracy showed that compounds with morphonotactic clusters (M+P+ items, with morpheme boundary and consonant cluster), but also M+P– items (compounds without a consonant cluster) show significantly higher accuracy than either type of monomorphemic words. This diverges from the results of the previous lexical decision experiments on inflection, and therefore supports the Strong Morphonotactic Hypothesis. The results for latency showed no significant difference, but a trend in favour of M+P+ compounds (with morpheme boundary and with consonant cluster).

These results have demonstrated that the decomposition of compounds does not slow down processing but is, rather, automatic, which supports Libben's (2014) principle of maximum opportunity.

Korecky-Kröll et al. (2016) therefore concluded that the facilitation process in the acquisition and processing of morphonotactic clusters only seems to apply in a language or linguistic domain that is morphologically rich, and it was suggested that the Strong Morphonotactic Hypothesis be modified accordingly (also in Sommer-Lolei et al. 2017). Thus, the interaction with morphology appears to facilitate processing where it "is worth it".

The third lexical decision experiment on derivational morphology (LDT 3) was performed by Sommer-Lolei et al. (2017) in order to support our theoretical claim that the morphological richness of a certain area facilitates processing, i.e. by also investigating the derivational domain.

The results of the LDT 3 demonstrated that M+P– items in particular (derivations containing a morpheme boundary, but no consonant cluster), but also M+P+ items (derivations containing a morpheme boundary and a consonant cluster), yield a significantly higher accuracy and are processed significantly faster than both types of monomorphemic words. Derived nouns (derived via productive word-formation rules) were processed more accurately than simplex nouns. Unlike the previous experiment on compounding (Korecky-Kröll et al. 2016, see above), these results do not support the Strong Morphonotactic Hypothesis directly but only indirectly, insofar as we found a positive effect on processing whenever a morpheme boundary is present.

To demonstrate differences in the processing of morphonotactic and phonotactic consonant clusters in compounds, derivations and monomorphemic words (LDT 4), a repeated measures ANOVA was performed on the mean values of the correct responses. The results show, with a Greenhouse-Geisser correction, that words containing a morpheme boundary were processed significantly differently, *F*(2.48,195.73) = 30.76, *p* < 0.01, regardless of whether the string contained a consonant cluster or not. Condition 1 (M+P+ Mean = 0.85) was significantly different from conditions 3 (M–P+ Mean = 0.79) and 4 (M–P– Mean = 0.79) but not from condition 2 (M+P– Mean = 0.84). M+P+ and M+P– items were significantly different from both categories without a morpheme boundary.
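For readers unfamiliar with the procedure, the following minimal sketch shows how a one-way repeated measures ANOVA with a Greenhouse-Geisser correction can be computed from a participants × conditions matrix of mean accuracies. It is written in plain Python for transparency and is not the software used for the analysis reported here; the function name is hypothetical and the example matrix is invented.

```python
def rm_anova_gg(scores):
    """One-way repeated measures ANOVA with Greenhouse-Geisser correction.

    scores: list of rows, one per participant, each holding one mean
    accuracy per condition. Returns (F, epsilon, corrected df1, corrected df2).
    """
    n, k = len(scores), len(scores[0])
    grand = sum(map(sum, scores)) / (n * k)
    cond_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    # Sums of squares for the condition effect and the residual error
    ss_cond = n * sum((m - grand) ** 2 for m in cond_means)
    ss_error = sum(
        (scores[i][j] - cond_means[j] - subj_means[i] + grand) ** 2
        for i in range(n)
        for j in range(k)
    )
    df_cond, df_error = k - 1, (k - 1) * (n - 1)
    f_value = (ss_cond / df_cond) / (ss_error / df_error)

    # Sample covariance matrix of the k conditions
    cov = [
        [
            sum(
                (scores[i][a] - cond_means[a]) * (scores[i][b] - cond_means[b])
                for i in range(n)
            ) / (n - 1)
            for b in range(k)
        ]
        for a in range(k)
    ]
    # Double-centre the covariance matrix (it is symmetric, so row
    # means equal column means)
    row_means = [sum(cov[a]) / k for a in range(k)]
    cov_grand = sum(row_means) / k
    centred = [
        [cov[a][b] - row_means[a] - row_means[b] + cov_grand for b in range(k)]
        for a in range(k)
    ]
    trace = sum(centred[a][a] for a in range(k))
    sum_sq = sum(value * value for row in centred for value in row)
    epsilon = trace ** 2 / ((k - 1) * sum_sq)

    return f_value, epsilon, epsilon * df_cond, epsilon * df_error


# Invented accuracies for five participants in four conditions
example = [
    [0.90, 0.80, 0.70, 0.80],
    [0.80, 0.90, 0.70, 0.70],
    [0.85, 0.85, 0.80, 0.75],
    [0.90, 0.80, 0.80, 0.80],
    [0.80, 0.80, 0.75, 0.80],
]
f_value, epsilon, df1, df2 = rm_anova_gg(example)
print(f"F({df1:.2f},{df2:.2f}) = {f_value:.2f}, epsilon = {epsilon:.2f}")
```

The correction multiplies both degrees of freedom by the same estimate ε, which is why corrected values such as those in the reported *F* statistic are not whole numbers.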

#### 3.7. RESULTS OF THE FAMILIARITY AND FOREIGNNESS RATINGS

In our lexical decision tasks on compounds (LDT 2) and derivations (LDT 3) the high number of foreign words within the group of monomorphemic words (e.g. *Taifun* 'typhoon', *Baklava* 'baklava') that were used within the experiments still seemed problematic to us as a possible confounding factor. Use of these items was due to the necessity of matching compounds and derivations to monomorphemic words in frequency and word length.

Therefore, we had 130 German native speakers rate all our stimulus words for the compound experiment in terms of the degree of familiarity and foreignness, which revealed that the four categories of stimuli differed significantly in their perceived degree of foreignness (see Figure 1), whereas in the familiarity rating only monomorphemic words without consonant clusters (M–P–) were rated significantly less familiar than all other categories (see Figure 2):

Figure 1. Degree of foreignness in the four categories (Rating of stimulus words for LDT 2)

Figure 2. Degree of familiarity in the four categories (Rating of stimulus words for LDT 2)

The results for compounds and monomorphemic words show that foreignness and, especially, unfamiliarity reduce accuracy and delay the speed of processing significantly.

Although familiarity and foreignness were less likely to constitute intervening variables for derivations than for compounds, we nevertheless had the 96 stimulus words rated for this experiment by 102 native speakers of German, which revealed a significant difference between the categories M– and M+ regarding the degree of foreignness (see Figure 3). In the familiarity rating we found no significant difference between the four categories (see Figure 4).

Figure 3. Degree of foreignness in the four categories (Rating of stimulus words for LDT 3)

Figure 4. Degree of familiarity in the four categories (Rating of stimulus words for LDT 3)

#### 4. STATISTICAL ANALYSES

#### 4.1. IMPACT OF FREQUENCY, FAMILIARITY AND FOREIGNNESS ON ACCURACY

As mentioned above, we hypothesized, first, that familiar words, ranked in advance by native speakers of Standard Austrian German (see also 3.4), would be processed significantly more accurately than unfamiliar words and, second, that stimulus words judged as foreign would result in significantly lower accuracy than all other words.

To demonstrate the effects of frequency, familiarity and foreignness in adult language processing we analysed their impact in terms of accuracy of words (non-words were excluded) in LDT 2 (compounds vs. simplex nouns), LDT 3 (derivations vs. simplex nouns) and LDT 4 (compounds and derivations vs. simplex nouns).

As for LDT 2 (Korecky-Kröll et al. 2016), statistical analyses of the accuracy of words reveal that words with a high AMC token frequency are significantly more likely to be judged correctly and that high-frequency words containing a morpheme boundary, especially M+P+ items, are processed significantly more accurately than words without one. In terms of familiarity and foreignness, we found that stimulus words that are either unfamiliar and/or foreign have significantly less accurate results than familiar and non-foreign words. In addition, there is no difference in this respect in the processing of morphonotactic and phonotactic consonant clusters across the four categories.

In our LDT 3 on derivations compared to monomorphemic nouns (Sommer-Lolei et al. 2017), words containing a morpheme boundary have significantly more accurate results compared to monomorphemic words, regardless of the absence or presence of a consonant cluster. Also, we found that words without a consonant cluster and without a morpheme boundary (M–P–) are processed significantly less accurately than M–P+, which demonstrates a processing strategy of preferring a phonotactic consonant cluster whenever no morpheme boundary is present. In other words, the presence of a morpheme boundary leads to significantly higher results than the presence of a cluster. This indicates the relevance of morpheme boundaries.

Similar to our finding in LDT 2, we found, with regard to familiarity and foreignness, that stimulus words that are either unfamiliar or foreign or both have significantly less accurate results than familiar/non-foreign words. In the case of familiarity of items, there is no difference in the processing of morphonotactic and phonotactic consonant clusters. However, when items are foreign, we found that words with a phonotactic consonant cluster (M–P+) are processed significantly more accurately compared to all other categories.

This appears to mean that (rather) foreign words of limited length are expected to be monomorphemic words, and this respects the fact that more simplex words are loaned than affixed words, in contrast to loaned English compounds.

By merging the previously conducted two experiments into one new experiment with respect to the degree of familiarity and of foreignness of the stimulus words, we found, with regard to the impact of token frequency, that words containing a morpheme boundary were processed with significantly greater accuracy compared to monomorphemic words, with M+P– items scoring highest. The processing of unfamiliar or foreign words in LDT 4 shows that these items are processed significantly less accurately, as in the previously conducted LDT 2 and 3. For familiarity we found no significant difference between the four categories. Foreign words containing a morpheme boundary have significantly less accurate results than words without it.

Statistical analyses of accuracy undertaken by means of a repeated measures ANOVA reveal differences in the processing of the three categories (compounds, derivations, simplex nouns). The results show that each of the three categories led to highly significant differences from each other: *F*(2, 158) = 35.29, *p* < 0.01. Independently of whether the item was a word or a non-word, compounds were processed with significantly more accuracy than simplex nouns, whereas derived words resulted in an intermediary position, significantly different from both compounds and monomorphemic nouns (see Figure 5).

Figure 5. Mean values of accuracy in processing in compounds, derivations and simplex words

When analysing our data (LDT 2, 3 and 4) with regard to familiarity by using general linear mixed-effects models (Bates et al. 2015), we found that very unfamiliar words are processed with significantly less accuracy in all three experiments, but that there were no significant differences with regard to the presence vs. absence of a morpheme boundary or consonant cluster or both. Familiarity is shown to be a highly influential factor when it comes to compounds (LDT 2 and 4), whereas in LDT 3 (derivations vs. simplex nouns), familiarity is an intervening variable to a lesser extent (see Table 6).

It is a novel finding, as first presented by Sommer-Lolei et al. (2017), that familiarity is a more important variable in compounds than foreignness, which itself is more important than token frequency. The results for LDT 2 show that foreignness is much closer to frequency than to familiarity. Therefore, we can conclude that although all three variables are significant in respect of processing, familiarity is the most important factor when dealing with compounds, which is also a warning against relying excessively on frequency. Instead, other variables, in particular familiarity, should be considered as well.

Thus, results point to the fact that it is important whether a given string is a citation form or not, and, additionally, whether the item contains a morpheme boundary is rather important, although we only found effects of morphonotactic consonant clusters in LDT 2 and 4, which demonstrates the strong impact of compounds.

Overall, the presence of a morpheme boundary facilitates word recognition and processing except for unfamiliar and foreign words, regardless of whether there is a consonant cluster or not.


Table 6. Hierarchy of influencing factors on the accuracy of German stimulus words
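Footnote 9 names the Akaike Information Criterion (AIC) as the primary criterion for model selection, with smaller values indicating a better fit. As a minimal sketch of that selection rule, the snippet below computes AIC = 2k − 2 ln L for a few candidate models and picks the smallest. The model labels, log-likelihoods and parameter counts are invented purely for illustration and are not results from the experiments; the function name `aic` is likewise hypothetical.

```python
def aic(log_likelihood: float, n_parameters: int) -> float:
    """Akaike Information Criterion: 2k - 2 ln(L)."""
    return 2 * n_parameters - 2 * log_likelihood


# HYPOTHETICAL candidates: label -> (maximised log-likelihood, number of parameters).
# These numbers are made up for the example and do not come from the study.
candidates = {
    "frequency only": (-1510.0, 3),
    "frequency + familiarity": (-1480.0, 4),
    "frequency + familiarity + foreignness": (-1479.5, 5),
}

scores = {name: aic(ll, k) for name, (ll, k) in candidates.items()}
best = min(scores, key=scores.get)  # smallest AIC wins
for name, value in scores.items():
    print(f"{name}: AIC = {value:.1f}")
print("selected:", best)
```

In this made-up example the third model fits slightly better in raw likelihood but is not selected, because the extra parameter costs more than the gain in fit; this is the trade-off the criterion is designed to express.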

#### 4.2. INFLUENCE OF FREQUENCY, FAMILIARITY AND FOREIGNNESS ON LATENCY

Reaction times were measured for LDT 2 and 3, but as already mentioned above, this was impossible for LDT 4. The results demonstrate in both experiments that frequency has an impact in the sense that words with a high token frequency in the AMC corpus are processed significantly faster, regardless of whether the word contains a morpheme boundary and/or a consonant cluster or not. Thus, the positive influence of a morpheme boundary on accuracy has no correspondence in latency. Interestingly, the interaction of morphology with phonological processing appears to increase accuracy, because accuracy is monitored on one additional level, whereas it neither slows down nor accelerates processing significantly.

In terms of familiarity, we find that unknown or very unfamiliar words were processed significantly slower in both experiments. In LDT 2 (compounds vs. simplex nouns), the analysis of reaction times shows that words containing a morpheme boundary but no consonant cluster (M+P–) delay processing significantly, followed by words with a morpheme boundary and cluster (M+P+), which show a weaker effect on latency. In LDT 3 (derivations), we did not find significant differences of this type.

<sup>8</sup> The following significance levels were selected: \* 0.05, \*\* 0.01, \*\*\* 0.001. Please note that the negative z values for the foreignness and familiarity rating are due to the coding. For foreignness, this is more intuitive: if a word was rated as being not a foreign word, this was coded as 1, whereas 2 indicated that it was rated as a foreign word. Therefore, as expected, the accuracy of the participants is higher if the word is less foreign. However, the familiarity rating may appear somewhat counterintuitive as it was, instead, an unfamiliarity rating, ranging from 1 (well known or familiar) to 5 (unknown or unfamiliar). This leads to the negative z values: the participants' accuracy is higher if a word is less unfamiliar (i.e. more familiar).

<sup>9</sup> AIC refers to the Akaike Information Criterion (see e.g. Levshina 2015: 149), which was used as the primary criterion for model selection: the smaller the AIC value, the better the fit of the respective model.

Regarding the degree of foreignness, the analysis reveals that highly foreign words are processed significantly slower in both experiments, which is also the case for stimulus words containing a morpheme boundary (with or without consonant cluster). Analysis of LDT 2 (compounding) shows that words with a morpheme boundary that also contain a consonant cluster (M+P+) delay processing with high significance, followed by words with a morpheme boundary and without a consonant cluster (M+P–), which tend to be processed slightly faster. Interestingly, we find the opposite picture when analysing data from our LDT 3, which means that words with a morpheme boundary but without a consonant cluster (M+P–) are processed significantly more slowly than the M+P+ items. This points to the fact that the presence of a morphonotactic consonant cluster delays processing in foreign compounds but only shows a weak effect on the latency of foreign derivatives.

As summarized in Table 7, we conclude that with regard to the latency of compounding, familiarity is the major influencing factor, whereas it is frequency that plays a highly important role when processing derivations.


Table 7. Hierarchy of influencing factors on the latency of German stimulus words

Morpheme boundaries tended to be helpful on the sublexical level, as found by Celata et al. (2015) in the split cluster task, where frequency also had an important impact (see 3.1). By contrast, our results on lexical processing in terms of latency show significant delays in the presence of morpheme boundaries in unfamiliar and/or foreign words, whereas morpheme boundaries were helpful in familiar and non-foreign words. Frequency was always helpful, but often much less so than familiarity.

#### 5. CONCLUSION

As for German, no effect of the morphonotactic character of consonant clusters is shown in inflection. Therefore, the Strong Morphonotactic Hypothesis is not supported for German inflection, either in first language acquisition or in adult or adolescent language processing.

The Strong Morphonotactic Hypothesis could only be supported for German compounding, where the strongest facilitating effect was found for morpheme boundaries with consonant clusters. However, in derivatives too, positive effects of morpheme boundaries (with and without consonant clusters) on processing were found. Nevertheless, when compounds, derivatives and monomorphemic words were directly compared within the same participants, compounds showed significantly higher levels of accuracy than derivatives. This points to the second compound constituent being more readily identifiable, due to its coexistence as an autonomous lexical element, compared to the harder process of retrieving suffixes (inflectional or derivational suffixes). In processing, this is a consequence of the process of chunking elements (here phonemes and graphemes). Apparently morphological chunking is one of the normal processing strategies. Therefore, it is desirable to conduct further experiments in which morpheme and syllable chunking can be compared.

Words are processed faster and significantly more accurately, the more familiar a stimulus word is (particularly in compounding) and, to a lesser extent, the more frequent it is (particularly in derivations). This greater effect of familiarity can be linked to Libben's (2014) principle of opportunity: in compounds, the familiarity of both the whole compound and of the word families of its constituents facilitates processing, whereas the familiarity of the more abstract, i.e. much less morphosemantically descriptive, suffixes must have a much smaller influence. As a consequence, the frequency and productivity of suffixes have a relatively greater importance.

#### ACKNOWLEDGEMENTS

This investigation was performed within the International Cooperation Project 'Human Behaviour and Machine Simulation in the Processing of (Mor)Phonotactics'. We thank the Austrian Science Fund (FWF): [I 1394-G23] for its support.

Sabine Sommer-Lolei is a recipient of a DOC-team fellowship of the Austrian Academy of Sciences. Markus Christiner's investigation is funded within the Post-DocTrack Programme of the OeAW.

We are sincerely grateful to Eva Maria Freiberger who worked as a project collaborator in the first project phase. She designed the LDT 1 and the PDT experiment, collected the data for these experiments and was also involved in their analysis. Furthermore, she did pioneering work on the acquisition of German morphonotactics.

We also want to thank Angelika Wukowits for supervising and executing one of the experiments together with Markus Christiner in the Secondary School for Economic Professions and the Educational Institution for Elementary Pedagogy at Sancta Christiana in Frohsdorf, Austria, with our special thanks to all students who took part in our task and to the director Dr. Alexander Kucera and Mag. Dr. Barbara Bohn for the general organization of the Science Day.



### IV. Exploring phonotactic and morphonotactic constraints in the acquisition of consonant clusters in L1 French

BARBARA KÖPKE<sup>1</sup>, OLIVIER NOCAUDIE<sup>1</sup>, HÉLÈNE GIRAUDO<sup>2</sup>

#### 1. INTRODUCTION

Consonant clusters are relatively rare in the languages of the world and count as phonologically marked (see e.g. Greenberg 1965; Clements & Keyser 1983; Maddieson 1984; Vennemann 1988; Blevins 1995; Dziubalska-Kołaczyk 2009). Nevertheless, they do figure prominently in a number of languages such as German or Polish. The fact that most of these languages are inflecting-fusional ones and that many of the clusters attested in them occur at morpheme boundaries suggests that morphological factors might be involved in the emergence and first language (L1) acquisition of clusters. In particular, the possibility that clusters may function as boundary signals (cf. already Trubetzkoy 1939) is likely to play a role.

The present study focuses on the time of emergence, position and phonotactic vs. morphonotactic status of consonant clusters (henceforth CC) in the acquisition of L1 French. In order to benefit from the growing number of resources made available by the scientific community, we selected a corpus out of the CHILDES database (MacWhinney 2000), allowing for comparison with L1 acquisition data collected to test the Strong Morphonotactic Hypothesis (SMH) for German (Freiberger 2007, 2014). We then analysed longitudinal data from four children aged 1;6 to 3;0 (years;months), collected in spontaneous speech interactions between a parent and the target child, with a generalized linear mixed model investigating the role of the factors age, position and phonotactic (PH) vs. morphonotactic (MPH) status in the successful pronunciation of the different CCs.
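One plausible way to write down a generalized linear mixed model of this kind, for a binary outcome (cluster produced correctly or not), is sketched below. The logit link and the random intercept per child are assumptions made for illustration; the text above names only the factors age, position and PH vs. MPH status and does not specify the link function or the random-effects structure.

$$
\operatorname{logit} P(y_{ij} = 1) = \beta_0 + \beta_1\,\mathrm{age}_{ij} + \beta_2\,\mathrm{position}_{ij} + \beta_3\,\mathrm{status}_{ij} + u_j, \qquad u_j \sim \mathcal{N}(0, \sigma_u^2)
$$

Here $y_{ij}$ is 1 if child $j$ produces the cluster in occurrence $i$ correctly and 0 otherwise, $\mathrm{status}_{ij}$ codes PH vs. MPH, and $u_j$ is the assumed child-specific intercept. Since position has three levels (initial, medial, final), $\beta_2\,\mathrm{position}_{ij}$ stands in for two dummy-coded terms.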

 <sup>1</sup> LNPL – Laboratoire de NeuroPsychoLinguistique, University of Toulouse (UT2), Toulouse, France.

 <sup>2</sup> CNRS, CLLE-ERSS, University of Toulouse (UT2).

#### 2. CONSONANT CLUSTERS IN FIRST LANGUAGE ACQUISITION

#### 2.1. PHONOTACTICS AND CONSONANT CLUSTERS

A number of studies have investigated the role of phonotactics in the acquisition of the L1. It is generally assumed that phonotactics provide cues allowing the child to identify word boundaries in speech, hence conferring a key role to phonotactics in word segmentation. This is confirmed by growing evidence about how the acquisition of phonotactics bootstraps the acquisition of lexicon and grammar (cf. Boll-Avetisyan 2012). Moreover, phonotactics also play a role in adult language processing: for instance, McQueen (1998) showed that adults recognize words more rapidly when the junction between two words forms a phoneme cluster that does not typically occur within words. Such findings draw attention to the role of CCs and their significance in phonotactics, since their nature and distribution vary considerably across languages.

The production of CCs has been shown to be particularly difficult in L1 acquisition, in accordance with the idea that more marked structures (Trubetzkoy 1939) will be more difficult to acquire than less marked ones. However, it has also been shown that the ease of acquisition of CCs is not homogeneous and depends on a number of parameters such as syllable structure, position, frequency, morphology and input factors (e.g. Demuth & McCullough 2009), which also differ across languages. In the same vein, Levelt, Schiller and Levelt (2000) have shown that the order of acquisition of syllable structures "closely matched the frequency with which those syllable structures occurred in child-directed speech" (cited by Demuth & McCullough 2009: 427).

Structural aspects of phonotactics have been comprehensively described in the Beats-and-Binding model of Dziubalska-Kołaczyk (2002), providing a scale of preference for markedness based on overall sonority as composed of sonority, place of articulation and voicing. Studies by Demuth and collaborators on L1 acquisition also take into account sonority (e.g. Demuth & McCullough 2009). In addition, it has been suggested that syllable-initial consonants are less marked than syllable-final consonants and hence easier to acquire, the former also being present in a larger number of languages. The same should hold for CCs in an initial position, which should be easier to acquire than CCs in a final position. However, for analytic languages such as English, the contrary seems to be true, which has been explained through frequency and syllable structure effects, among other things (see Demuth 2007 or Demuth & McCullough 2009, for reviews). Data for Dutch (where word-initial and word-final clusters are about equally frequent) suggest that some children produce clusters more easily in the initial position and others in the final position.

These data suggest a strong link between the acquisitional process and the structure of the target language. Demuth and McCullough (2009) have challenged these predictions with respect to French, based on the idea that analyses of child-directed speech show that around 70% of CCs are in the word-initial position, predicting a frequency advantage for word-initial clusters. This hypothesis is investigated with the analysis of the speech production of two children (Tim and Marie from the Lyon Corpus recorded by Demuth & Tremblay 2008) recorded repeatedly between ages 1;5 and 3;0. The results show higher accuracy in the production of word-initial CCs than word-final CCs as predicted, and this independently of syllable structure factors such as sonority. The study furthermore showed that word-medial CCs were produced with roughly the same accuracy as word-initial CCs in French.

#### 2.2. MORPHONOTACTICS AND CONSONANT CLUSTERS

Apart from frequency and structurally based factors, morphological structure has also been supposed to play a role in the acquisition of CCs. For example, Kirk and Demuth (2005) examined the production of CCs in an elicitation task in English-speaking two-year-olds and found that the children were more accurate in producing word-final clusters consisting of obstruent + /s/ compared to word-initial clusters (e.g. cups vs. spoon) and, most interestingly, to final clusters with the same but reversed segmental content (e.g. cups vs. wasp). The authors explain these results through both input-related and morphological factors: obstruent + /s/ clusters are more frequent, but most importantly (and probably linked to the first factor), they involve an inflectional (and very productive) morpheme. This could lead to a morphological advantage for final clusters, which, in turn, may help children acquiring languages such as English to focus very early on complexity at the end of the words. Furthermore, morphological effects appear very early in language perception and production. Numerous studies suggest that infants start perceiving functional morphemes at an early age. They distinguish forms of function words (free morphemes) from content words and decode specific function words during the first year of life (e.g. Shi & Werker 2003; Shi, Werker & Cutler 2006; Shi, Werker & Morgan 1999).

A recent study conducted by Marquis and Shi (2012) provided the first empirical evidence that French-learning 11-month-olds can use the encoded bound morphemes for interpreting the internal units of newly encountered words. They demonstrated that infants analyse the word-internal morphology during their first year of life before learning word meaning, and that the decoding of a bound functional morpheme depends on the morpheme frequency. This finding suggests that rudimentary representations of morphological alternations emerge very early in the long-term memory of infants. Finally, in an application of the Beats-and-Binding model to L1 acquisition, Zydorowicz (2007, 2009) has shown that Polish-speaking children reduce morphonotactic clusters less frequently than phonotactic ones, similarly to what has been shown for Lithuanian and English (Kamandulytė 2006; Kirk & Demuth 2005).

Despite such promising results, investigations of how morphotactics facilitate the acquisition of phonotactics in children are still rare. However, Dressler and Dziubalska-Kołaczyk (2006) and Dressler, Dziubalska-Kołaczyk and Pestal (2010) have elaborated a model of morphonotactics and its correspondence to phonotactics that has been tested in preliminary studies on L1 acquisition of German (Freiberger 2014). The Strong Morphonotactic Hypothesis (SMH) issued within this framework assumes that typically developing children should acquire morphonotactic clusters earlier than comparable purely phonotactic ones, because morphonotactic clusters are likely to be expressed more consistently in the input to which the children are exposed. Moreover, their segmental constituents will also occur independently of each other in phonologically less marked configurations. The investigation of morphonotactic clusters acquired during different acquisitional phases of productive morphology (cf. Bittner, Dressler & Kilani-Schoch 2003) should allow us to establish which of the two factors plays a more important role. The framework also assumes that morphonotactic clusters with many phonotactic counterparts lack a morphological signalling function. Moreover, they may often be affected by the same repair mechanisms as parallel phonotactic clusters and expressed less faithfully in the adult speech input than morphonotactic clusters with few or no phonotactic counterparts. Therefore, they also ought to be acquired less easily and during later phases. Nevertheless, adult-like production of morphonotactic clusters may precede (in terms of the first emergence or frequency of occurrence) the production of homophonous purely phonotactic clusters.

Furthermore, phonologically more marked clusters should be acquired later than less marked ones (independently of whether they are phonotactic or morphonotactic). Still, it needs to be investigated how the acquisition of morphonotactic clusters is affected by the absence or presence and also the frequency of parallel phonotactic ones. The framework also stresses that the measurement of markedness should differentiate between different positions in the word and give much consideration to ease of perception (such as the Net Auditory Distance among segments, as proposed by Dziubalska-Kołaczyk 2009).

These assumptions have been investigated by Freiberger (2014) through the analysis of a longitudinal corpus of spontaneous speech data from 3 typically developing monolingual children acquiring German as an L1 in Austria that was analysed with respect to the interaction of phonotactic and morphotactic factors. For each child, the researcher selected 30 minutes of recordings per month between age 1;6 and 3;0<sup>3</sup>. Three developmental phases were distinguished during this period. The data were transcribed according to CHILDES norms, all spontaneously produced clusters were extracted, and correctly vs. incorrectly pronounced clusters were analysed with respect to the number of consonants, position, morphonotactic vs. phonotactic status and number of morphological boundaries. The results showed the expected progression in accuracy with age and that the children had more difficulties with initial than with medial and final clusters (similarly to what has been shown for English). While morphonotactic clusters did not involve additional difficulties due to their complexity, there was no interaction between morphological and phonological factors as expected.

Given the specificities in the acquisition of morphonotactic CCs to be expected in different languages varying with respect to inflectional patterns and phonological typology, the present study seeks to complement existing data on the acquisition of CCs in Polish, German, English and Lithuanian with data from French that, as a Romance language, can be expected to show a different acquisitional path. While some data exist on phonotactic factors, previous studies on CC acquisition in French have not taken morphological factors into account.

#### 3. METHODOLOGY

The aim of the present study was to explore the acquisition of CCs by L1 French children, their time of emergence and speed of acquisition, as well as to scrutinize aspects of the SMH (Dressler & Dziubalska-Kołaczyk 2006) in L1 French acquisition. In order to provide data that can be compared to other related studies within this framework, the methodology chosen is based on a replication of Freiberger's (2014) analysis of a corpus with data from 3 children aged 1;6 to 3;0.

 <sup>3</sup> One of the children was recorded from age 1;3, allowing the author to take into account a fourth developmental stage, T0, for this child.

#### 3.1. CORPUS SELECTION

An increasing number of resources on L1 acquisition have been made available to the scientific community in recent years. For the present study, we referred to the CHILDES database (MacWhinney 2000) and selected a corpus with a comparable recording process to the methodology used by Freiberger (2014) in order to test the SMH in L1 German acquisition. The criteria we used were the recording of spontaneous speech interactions between a parent and the target child, an age of onset of 1;6 and regular recordings up to 3;0. The corpus recorded by Demuth and Tremblay (2008) (hereafter 'Lyon Corpus') meets these requirements: four children (Anaïs, Marie, Nathan and Theotime) were, on average, recorded bimonthly from an onset age of 1;0 to the age of 3;0 (4;0 for Marie), and the children's utterances have been orthographically and phonetically transcribed in CHAT format (MacWhinney 2000). The data of one of the children (Marie) had already been analysed with respect to the acquisition of CCs (together with data from Tim in Demuth & McCullough 2009). A further advantage of the Lyon Corpus is that data for child-directed speech are available from the mothers of the children.

#### 3.2. METHODOLOGY AND CHARACTERISTICS OF THE CORPUS

The data analysed here are summarized in Table 1. We have analysed 18 recordings per child for Anaïs, Marie and Theotime and 20 recordings for Nathan. The table shows that the number of word tokens is highly variable from one child to another – as can be expected in this age group – ranging from 6,396 tokens (Nathan) to 22,729 tokens (Anaïs). First of all, we analysed the orthographic transcription of the corpus. To start with, we performed a frequency count of each lexical item transcribed from these recordings and sorted all words containing CCs with the help of the FREQ command in CLAN (freq \*.cha +t\*CHI) followed by manual extraction of the targets. The following tokens were excluded from this analysis: some complex lexical units (*parce que* 'because')<sup>4</sup>, proper names (*Amtaro*) and onomatopoeia (*vroum*). The analysis shows that infant speech in French involves an interesting, albeit also variable, proportion of words containing CCs, ranging from 4.4% for Anaïs to 10.6% for Theotime.

 <sup>4</sup> In what follows, we provide English translations for the French lexical units, but not for proper names and interjections or onomatopoeia.
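For readers without access to CLAN, the frequency count over the child tier can be approximated in a few lines of Python. The sketch below is a simplified stand-in for the FREQ step, not a reimplementation of it: the function name `chi_word_frequencies` and the sample lines are hypothetical, CHAT annotation codes other than the placeholders `xxx`, `yyy` and `www` are not handled, and the manual extraction of CC targets described above is not reproduced.

```python
import re
from collections import Counter

# CHAT placeholders for unintelligible or untranscribed material.
UNINTELLIGIBLE = {"xxx", "yyy", "www"}
# Letter sequences, optionally joined by apostrophes or hyphens (e.g. s'il-te-plaît).
WORD = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def chi_word_frequencies(chat_lines):
    """Count word tokens on the *CHI main tier of CHAT-formatted lines."""
    counts = Counter()
    on_chi_tier = False
    for line in chat_lines:
        if line.startswith("*"):          # a main tier: *CHI:, *MOT:, ...
            on_chi_tier = line.startswith("*CHI:")
            text = line.partition(":")[2]
        elif line.startswith("\t"):       # continuation of the previous tier
            text = line
        else:                             # headers (@...) and dependent tiers (%...)
            on_chi_tier = False
            continue
        if on_chi_tier:
            counts.update(
                w for w in WORD.findall(text.lower()) if w not in UNINTELLIGIBLE
            )
    return counts


# HYPOTHETICAL sample lines, invented for illustration only.
sample = [
    "@Begin",
    "*CHI:\tregarde la fleur .",
    "%pho:\tʁəɡa la flœʁ",
    "*MOT:\toui c'est une fleur .",
    "*CHI:\tfleur xxx .",
]
freqs = chi_word_frequencies(sample)
print(freqs.most_common())
```

Only the child's main tier is counted; the phonetic tier and the mother's utterances are skipped, which mirrors the restriction to the child's speech in the command quoted above.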


Table 1. Number of recordings and tokens per child, percentages of CC tokens in the corpus

Further analysis of the words produced by two of the children (Marie and Nathan) demonstrates the variety of CCs found in the speech of the children. Table 2 presents a summary of the cluster combinations according to consonant type of the first and the second consonant. Unsurprisingly, the table clearly shows that combinations of plosives and liquids are most frequent in infant speech (see also Demuth & McCullough 2009). The data also provide evidence for the presence of some more complex clusters involving three or four consonants (see Table 3).


Table 2. Variety of CCs in Marie and Nathan's recordings

Table 3. CCs with 3 or more consonant sounds in Marie's and Nathan's recordings


We also analysed the number of CCs for different developmental stages, similarly to the analyses by Freiberger (2014). The data summarized in Table 4 show a clear progression of the number of CCs with age for each child. However, bear in mind that these data were obtained with the orthographical tier and remain hypothetical with respect to the actual production of the CCs. As such, they mainly demonstrate the diversification and complexification of the lexicon in each child.


Table 4. Number of CCs in the word tokens per child for each age group

Qualitative analysis of the words (see Tables I and II in appendix) shows that the first words containing CCs produced by French L1 children are from various categories, with an important proportion of nouns (e.g. *nounours* 'teddy', *veste* 'jacket', *écharpe* 'scarf', *fleur* 'flower', *fraise* 'strawberry') and interjections/onomatopoeia (e.g. *bravo*, *oups*, *vroum*, *gling*, *clac*, *crac*), followed by adverbs or adverbial locutions (e.g. *s'il-te-plaît* 'please', *après* 'after', *autre* 'other', *plein* 'lots', *trop* 'too much') and verbs (e.g. *regarde* 'look', *prend* 'take', *parti* 'gone', *marche* 'walk', *ferme* 'close'), mostly in infinitive or participle constructions.

We then sorted words containing CCs that were classified as either phonotactic or morphonotactic in the productions of the four children (see Table 5). This was achieved with the FREQ command provided by CLAN. In addition, manual extraction of our targets allowed us to establish the CCs' frequency in French and their phonemic variety in infant speech. Then, the CCs presented in Tables 2 and 3 were set up as targets we systematically searched for within a subset of the Lyon Corpus. This subset is composed of around 20 hours of recording for each child and balanced across different time frames. Within this subcorpus, we compared for each target token the expected phonological form and the child's actual pronunciation. Through KWAL commands followed by manual extraction, we sorted 5,276 occurrences which were categorized according to the following criteria: child, age, gender, CC group (Table 2 headers), CC (Table 2 contents), CC position (initial, medial, final), PH vs. MPH (phonotactic vs. morphonotactic cluster), grammatical class, lexical form, number of syllables, phonetic realization, CC realization (0 = error, 1 = success), and the type of CC error (reduction, substitution, omission, repetition, epenthesis, shifted cluster or mixed sounds).
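The coding criteria listed above map naturally onto one record per cluster occurrence. The sketch below shows one possible representation together with a helper that computes the proportion of correctly realized clusters for any grouping of the criteria. The class and function names, the field names, and the three sample records (including their ages, cluster groups and transcriptions) are hypothetical illustrations and are not taken from the corpus.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClusterOccurrence:
    child: str
    age: str                      # years;months, e.g. "2;06"
    gender: str
    cc_group: str                 # consonant-type combination (Table 2 headers)
    cc: str                       # the cluster itself (Table 2 contents)
    position: str                 # "initial", "medial" or "final"
    status: str                   # "PH" (phonotactic) or "MPH" (morphonotactic)
    grammatical_class: str
    lexical_form: str
    n_syllables: int
    phonetic_realization: str
    success: int                  # 0 = error, 1 = success
    error_type: Optional[str]     # e.g. "reduction"; None when success == 1


def accuracy_by(records, *keys):
    """Proportion of successful clusters for each combination of the given fields."""
    totals = defaultdict(lambda: [0, 0])   # group -> [successes, occurrences]
    for r in records:
        group = tuple(getattr(r, k) for k in keys)
        totals[group][0] += r.success
        totals[group][1] += 1
    return {g: s / n for g, (s, n) in totals.items()}


# HYPOTHETICAL records, invented to show the shape of the data only.
records = [
    ClusterOccurrence("Marie", "2;00", "F", "fricative+liquid", "fl", "initial",
                      "PH", "noun", "fleur", 1, "flœʁ", 1, None),
    ClusterOccurrence("Nathan", "2;00", "M", "liquid+plosive", "ʁd", "final",
                      "MPH", "verb", "regarde", 2, "ʁəɡad", 0, "reduction"),
    ClusterOccurrence("Marie", "2;06", "F", "liquid+plosive", "ʁd", "final",
                      "MPH", "verb", "regarde", 2, "ʁəɡaʁd", 1, None),
]

print(accuracy_by(records, "status"))
print(accuracy_by(records, "child", "position"))
```

Grouping by `status` and `position` yields a cross-tabulation of the kind summarized for PH and MPH clusters, and grouping by `child` and `age` gives per-child accuracy over time.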


Table 5. Total number of phonotactic (PH) and morphonotactic (MPH) clusters in each position for the four children

Table 5 shows that there are more than four times as many phonotactic clusters (N = 4,259) as morphonotactic clusters (N = 1,010) in the recordings. While PH clusters appear in all positions and even slightly more in the word-initial position, MPH clusters mostly appear in the final position and to a lesser extent in the middle of words, but not at all in the word-initial position. The latter finding is not surprising since word-initial MPH clusters are non-existent in French and consequently absent from child-directed speech (Demuth & McCullough 2009).

Table 6. Distribution of the different types of MPH clusters in word-medial and word-final position


Table 6 shows the number of tokens and distribution of the different MPH clusters in the corpus. It is obvious from the data that some of the clusters are very rare and appear in only one specific word. For example,

<sup>5</sup> The data concerning these items have, however, to be treated with caution. First, the morphotactic status of the cluster is questionable since the different forms of the verb *regarder* 'to look' are principally opposed to the noun *regard* 'gaze' (rare in child language!), and the noun is derived from the verb, and not the other way around (Dressler, personal communication). Additionally, as we will see later, this cluster is very frequently reduced through omission of the liquid by the children, but also in child-directed speech (Demuth & McCullough 2009).

'sf' is limited to the word *transforme* 'transform' and only appears in this configuration in a word-medial position. Other clusters show high overall frequency, but again, they appear in only one very frequent token (*pourquoi* 'why', *regarde* 'look').

We then proceeded to the analysis of the phonological tier allowing us to establish the proportion of CCs correctly produced. Table 7 recapitulates the percentage of correctly produced clusters for each child in the different developmental stages. The high variability across the children is striking. While two of the children (Marie and Theotime) reach more than 80% correct performances by age 3, the proportion is only 40% for Nathan (and not yet stable), and it does not exceed 10% for Anaïs, for whom no real progression is evident over the recording periods. Furthermore, it cannot be excluded that the actual production of CCs is related to the use of words containing CCs, the two children obtaining the highest mastery of CCs also being those who produce the most words containing them (see Table 4). However, it also has to be noted that Anaïs, who shows clear difficulties with CCs, seems to be rather talkative, as indicated by the number of tokens produced (see Table 1).


Table 7. Percentage of correctly produced consonant clusters per child and per age group

We then applied a generalized linear mixed model investigating the role of the following factors: age, position and PH vs. MPH status on the successful pronunciation of the different CCs. The results of this analysis are presented in the next section.

#### 3.3. STATISTICAL EXPLORATION OF THE FACTORS FAVOURING SUCCESSFUL PRONUNCIATION OF CCS

For the statistical exploration, a generalized linear mixed model was chosen because our data involve a binary outcome variable (whether a CC is produced correctly or not) in relation to a list of factors, some of these being random (e.g. the child's proficiency or the difficulty in pronouncing a CC) and others considered as independent variables (e.g. PH/MPH status). With such a model, the slope of the logit is modelled over time (that is, the fixed effect on the probability that a CC is correctly produced), depending on the position of the CC, its PH or MPH status, the number of syllables in the word, and so on. We have defined two types of random effects:


Model 0 inspected the evolution of performance depending on the age of the children. The 4 children and all 10 CC classes (e.g. *liquid+plosive* or *plosive+obstruent*) were taken as random effects in the model. The results show that with growing age, the probability of a 'correct' answer increases significantly (β = 0.63, *p* = 0.001), as can be expected with typically developing children. Figure 1 represents the probability slope that should be expected for any child, following Model 0's data.
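As an illustration, Model 0 can be written out under the assumption (not stated explicitly in the text) that children and CC classes enter as random intercepts. The symbols below are introduced here for exposition only:

$$
\operatorname{logit} P(y_{ijk}=1) \;=\; \beta_0 + \beta_{\text{age}}\,\text{age}_{ijk} + u_i + v_j,
\qquad u_i \sim \mathcal{N}(0,\sigma_u^2),\quad v_j \sim \mathcal{N}(0,\sigma_v^2),
$$

where $y_{ijk}$ is the realization (1 = correct, 0 = error) of occurrence $k$ produced by child $i$ ($i = 1,\dots,4$) for CC class $j$ ($j = 1,\dots,10$). The predicted probability is recovered through the inverse logit, $P = 1/(1+e^{-\eta})$, with $\eta$ the right-hand side of the equation. The reported coefficient $\beta_{\text{age}} = 0.63$ is therefore a change on the log-odds scale per unit of the age predictor as coded by the authors.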

Figure 1. Percentage of consonant clusters correctly pronounced per age group (Model 0)

For Model 1, we added the morphonotactic status of the CC (PH vs. MPH) to the factor age and took into account the possible interactions between these factors. As for Model 0, an increase in age results in an increase of the probability that a CC is correctly produced (β = 0.71, *p* = 0.001). The model also suggests that the MPH status over all age groups had a positive effect on the pronunciation of CCs (β = 0.72, *p* = 0.02). However, the negative interaction between the factor age and MPH status tempers this result (β = -0.37, *p* = 0.001). To refine these first results, and as MPH CCs are not spread evenly across positions in the word, the following statistical models take into account the position of the CC in the word.

Model 2 considered the CCs located in the word-initial position as well as the age groups. As CCs in this position cannot have MPH status, the latter factor was excluded from the model. Model 2 reports a positive effect of increasing age (β = 0.59, *p* = 0.001) and the initial position of the CC (β = 0.75, *p* < 0.01) on the success of pronunciation of a CC. The interaction between these two factors also has a positive, albeit slight, effect (β = 0.15, *p* = 0.001).

Model 3 investigated CCs located in the word-medial position, age groups, and the CC's MPH status. Again, rising age results in an increase in the probability that a CC will be pronounced correctly (β = 0.63, *p* = 0.001). Nonetheless, when considered alone, the MPH status of the CC (β = -1.39, *p* = 0.001) and the medial position of CCs (β = -0.73, *p* < 0.01) seem to decrease the chance that a CC will be produced correctly. If we now consider the interactions in Model 3, there is a positive interaction between MPH status\*medial position (β = 1.76, *p* = 0.001) with respect to correct pronunciation of the CC, while a less pronounced positive effect is observed for the age\*CC's medial position interaction (β = 0.20, *p* < 0.01).

Finally, Model 4 focused on the CCs in the word-final position, while also taking into account age and MPH status. The age factor remained, once again, the major effect (β = 0.80, *p* = 0.001) in the model. When isolated, factors such as the final position and MPH status had no significant effect in this model. However, the interactions age\*final position (β = -0.35, *p* = 0.001) and MPH status\*final position (β = -0.54, *p* < 0.05) showed a moderate negative effect on the probability that a CC will be produced correctly.
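Because the coefficients of Models 0 to 4 are expressed on the logit scale, their size is easier to appreciate once converted to odds ratios. The sketch below performs this conversion for the age coefficients reported above; the function names are illustrative, and the unit of the age predictor (presumably one age group) is an assumption, since the coding is not specified here.

```python
import math

# Age coefficients on the logit scale, as reported in the text for Models 0-4.
AGE_BETAS = {
    "Model 0": 0.63,
    "Model 1": 0.71,
    "Model 2": 0.59,
    "Model 3": 0.63,
    "Model 4": 0.80,
}


def odds_ratio(beta: float) -> float:
    """Multiplicative change in the odds of a correct CC for a one-unit increase."""
    return math.exp(beta)


def shifted_probability(baseline: float, beta: float) -> float:
    """Probability obtained after adding beta to the logit of a baseline probability."""
    if not 0.0 < baseline < 1.0:
        raise ValueError("baseline must lie strictly between 0 and 1")
    logit = math.log(baseline / (1.0 - baseline)) + beta
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    for model, beta in AGE_BETAS.items():
        print(f"{model}: beta = {beta:.2f}, odds ratio = {odds_ratio(beta):.2f}")
```

The same conversion applies to the negative coefficients: a value below zero yields an odds ratio below 1, that is, lower odds of a correct production. Interaction terms have to be added to the relevant main effects before exponentiating.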

To summarize our results, the statistical analysis showed that the factor age was the most important influence on the children's output, as can be expected from typically developing children. Concerning the CC's position in the word, French children tend overall to a left-side preference in the development of the pronunciation of CCs, with word-initial CCs tending to be produced correctly at an earlier age. As for the CC's MPH status, our models observed that only MPH clusters in a medial position (and probably specifically for the latest age group considered here) had positive effects on the success of pronunciation of clusters. That being said, it must be kept in mind that these results may be influenced by several factors inherent to our corpus:


'why' with a cluster in the medial position, or *regarde*<sup>6</sup> 'look', in imperative mode, with a cluster in the final position. While *regarde* is used as a deictic from very early on (543 tokens, from age 1;6 in the corpus) by both children and caregivers, *pourquoi* (162 tokens, from age 2;0 in the corpus) emerges later in the child lexicon, i.e. at a more advanced developmental stage.

Nevertheless, the results also showed that there is a lot of variation among the children, which is why the next section seeks to explore further the individual developmental trajectories.

#### 3.4. EXPLORATION OF THE INDIVIDUAL DEVELOPMENTAL TRAJECTORIES

The developmental trajectories for each of the four children were explored with reference to the type of the given responses, first for all CCs and then specifically for MPH clusters. The results for overall CC production (see Figure 2) illustrate the inter-individual differences: while two of the children (Anaïs and Nathan) produce hardly any correct clusters during the entire period of investigation (1;6–3;0), the other two (Marie and Theotime) show a steady increase in correct production from age 2;0 (Marie) and age 1;6 (Theotime) respectively. The error patterns are also interesting and demonstrate a specific trajectory for each child. While substitutions are rare in the production of all four, the children differ considerably with respect to the amount and distribution of the other error types. Theotime performs very well from the beginning: omissions of CCs are rare in his speech even at the first recording, and remain close to zero thereafter. He prefers reductions of CCs, but even these are exceeded by correct productions from as early as around age 1;8. Both Nathan and Marie start with 100% omissions at age 1;6, but these are overtaken by reductions at around age 1;8 for Marie and 2;4 for Nathan. Anaïs, who shows the most severe difficulties with CCs, has a quite different distribution of these two error types: from the beginning, she produces more reductions than omissions, and despite a lot of variation in the different recordings, this does not change substantially over the period investigated.

We then looked specifically at the distribution of these error types and the correct productions of CCs in MPH clusters only (see Figure 3). Anaïs' CC productions remain close to zero during the whole investigation period, while errors seem to be distributed randomly between reductions and omissions. In Nathan's speech, correctly pronounced MPH CCs appear only at the end of the period, from age 2;9 onwards, and reductions start to exceed omissions from around age 2;5 onwards. Marie shows a similar pattern, but much earlier: correct productions appear at age 2;0 and the number of reductions exceeds the omissions from around age 1;10. For Theotime the trajectory here is more similar to the pattern shown by Nathan and Marie, but even earlier than Marie and with higher success rates. Despite the high variability across children, these data suggest that there is an overall developmental pattern with the number of omissions decreasing while the number of reductions increases across the age groups.

 <sup>6</sup> But see our reservations about the MPH status of the final cluster in *regarde* in note 3 (section 2.2).

Figure 2. Percentage of occurrence of each type of CC event coded in the corpus for each child/age group. Correctly produced CC (blue), CC omission (red), CC reduction (green), CC substitution (purple).

Figure 3. Percentage of occurrence of each type of event coded in the corpus for MPH clusters only \* child \* age group. Correctly produced CC (blue), CC omission (red), CC reduction (green), CC substitution (purple)

#### 4. DISCUSSION

The results of our data analysis are in line with the other studies on the acquisition of consonant clusters in French, showing that all these clusters are subject to different types of modifications (namely reduction to the least sonorous segment, gliding, insertion of schwa, etc.) before they are produced in a target-like manner. This has been documented, for example, in the study conducted by Andreassen (2013) on 13 monolingual children aged 2;2–3;2 years. The data are also consistent with the studies on CC acquisition in general (for a review, see McLeod, van Doorn & Reed 2001) showing that CCs emerge very progressively around age 2 (e.g. French 1989; Lleó & Prinz 1996, for Spanish and German). Overall, the studies on CC acquisition suggest that production of CCs starts around age 2, but, independently of the target language, their production is rarely correct at that age. Additionally, the age of mastery seems to be highly variable across children, as also documented by the data from the four children we considered in this study showing considerable variation in the correct CC productions of, for instance, Anaïs (10.07% correct at age 3;0–3;2) and Theotime (84.11% correct at the same age). With regard to American English, Shriberg and Kwiatkowski (1980) observed 90% correct at age 4, while Smit et al. (1990), on the contrary, report that only a few productions were correct at age 4. In their sample the majority of CCs seemed to be mastered at age 6 or 7, though some difficulties persisted even up to age 8 or 9.

Among the factors generally found to influence the age of mastery of pronunciation are the position and structure of the cluster. Yet the observations made with respect to these factors are still inconsistent. For example, McLeod and colleagues, in an extensive literature review on CC acquisition, state that "In languages other than English, word-final CC have been reported to be acquired before word-initial clusters" (McLeod et al. 2001: 101). However, as regards French, word-initial clusters tend to be produced earlier than word-final clusters (see Demuth & McCullough 2009; Kirk & Demuth 2005). While it is also acknowledged that the position no longer matters in older children, such inconsistencies between studies need to be elucidated.

Additionally, McLeod et al. (2001) suggest that clusters involving plosive+liquid are easier to acquire than clusters involving fricative+liquid. This is in line with what has been observed by Demuth and colleagues (Demuth & McCullough 2009; Demuth & Tremblay 2008) for the acquisition of CCs in French. However, there are also a number of other, less investigated factors that are likely to play a role in language-specific acquisition trajectories: the syllable structure, foot structure, frequency (e.g. Kehoe et al. 2008), saliency (Baroni 2014) and the primary and secondary status of the cluster (Andreassen 2013). For instance, there is a strong tendency for French towards monosyllabism resulting in a reduced number of morphemes per word and hence in fewer morphological boundaries and reduced morphonotactics.

Concerning our main research questions, the findings of the present study suggest that there are differences in the processing of MPH and PH clusters in the acquisition of L1 French. These are, however, modulated by the fact that MPH clusters mainly appear in the word-final position in French (and to a lesser extent in a medial position), a position that seems, for other reasons, to be less favourable in early acquisition. The observations we made are consistent with the predictions of the SMH in respect of the phonological and morphological characteristics of French (Dressler & Dziubalska-Kołaczyk 2006). From a phonological point of view, French is a largely vocalic language and, as such, has only a limited number of consonant clusters. On the morphological side, French is one of the weakest inflecting languages among the fusional languages (at least as far as oral French is concerned). Consequently, and as predicted by morphonotactics, the number of MPH clusters is rather low, at least in the vocabulary of the age groups investigated in the present study, and results have to be treated with caution. In order to counterbalance frequency effects, it might be interesting to compare data from younger children acquiring languages with many MPH clusters with older children acquiring languages with few MPH clusters.

Another issue that needs to be discussed is the actual presence of the investigated clusters in child-directed speech to which these children are exposed. Recent studies on speech directed to French infants suggest that French parents tend to reduce omission of schwa when addressing their young children (e.g. Andreassen 2011; Liégeois, Saddour & Chabanal 2015). Reduced omission of liquids in consonant clusters has also been reported for child-directed speech in connection with one of the children investigated in the present study in respect of word-initial clusters, where parents tend to produce more complete clusters when addressing their young children compared to their usage in adult-directed speech (Demuth & McCullough 2009). However, things seem to be more variable with regard to word-final clusters: while the results of the same study showed very low omission rates for /R/-plosive clusters (e.g. *barbe* 'beard' in our corpus – with the exception of *regarde* 'look', one of the most frequent productions in our corpus where omission rates are fairly high), omission rates were high for final plosive-/R/ clusters (75% for Marie's mother, e.g. *autre* 'other'). This suggests that the advantage observed for the acquisition of word-initial CCs is not only due to overall frequency but also to the fact that word-final clusters are often phonetically reduced in the input (see Demuth & McCullough 2009, for more details).

#### 5. CONCLUSION AND PERSPECTIVES

Based on a longitudinal corpus analysis of natural data from four children, the present study provides preliminary data on the scope of the SMH (Dressler & Dziubalska-Kołaczyk 2006) as regards French. In accordance with the predictions of the SMH, our data show that the morphonological status of a CC does not *per se* facilitate its acquisition, which is largely modulated through other factors such as frequency and position in the word, at least in the initial developmental stages investigated in the present study. However, of course, the present investigation has only permitted a preliminary exploration of the issue of the interaction of phonological and morphological aspects in the development of CCs in French. An extension of the study to later developmental stages in older children, allowing us to weight the influence of frequency, is clearly one of the primary perspectives arising from the present investigation. It would also be interesting to look more closely into the processing cost with regard to the processing of inflected forms without clusters and to inspect error patterns related to PH/MPH status of CCs.

#### ACKNOWLEDGEMENTS

The present study was made possible through financial support from an ANR grant n° ANR-13-ISH2-0002-02 to Basilio Calderone for the Cooperation Project 'Human Behaviour and Machine Simulation in the Processing of (Mor)Phonotactics'. We would also like to thank Wolfgang U. Dressler for helpful comments on an earlier version of the paper and Sabine Sommer-Lolei and Nicola Wood for careful editing. All remaining errors are ours.

#### REFERENCES



#### APPENDIX


Table II. Types of clusters starting with liquids and their representations per position, their PH or MPH status, number and proportion

### V. The natural perceptual salience of affixes is not incompatible with a central view of morphological processing

HÉLÈNE GIRAUDO <sup>1</sup>, KARLA ORIHUELA <sup>1</sup>, BASILIO CALDERONE <sup>1</sup>, BARBARA KÖPKE <sup>2</sup>

#### 1. INTRODUCTION

The Strong Morphonotactic Hypothesis (Dressler & Dziubalska-Kołaczyk 2006) assumes that phonotactics helps in the decomposition of words into morphemes. Accordingly, sequences occurring over a morpheme boundary (e.g. *mords*, *mordent*, *mordre*: /mɔʀ/, /mɔʀd/, /mɔʀdʀ/) correspond to a prototypical morphonotactic sequence that should be processed faster and more accurately than phonotactic sequences (e.g. *ordre* /ɔʀdʀ/).

While this issue is usually explored within studies on typical and atypical first language acquisition, a recent study carried out by Korecky-Kröll et al. (2014) tested this hypothesis with German-speaking adults. One experiment using a letter search task (i.e. find a letter like, for example, *T* at different positions – initial, medial, final – in a visual word like *Taube* 'dove', *dank-te* 'thanked' and *pack-t* '(s/he) packs') investigated whether sublexical letter sequences were found faster when the target sequence was separated from the word stem by a morphological boundary (e.g. *pack-t*) than when it was a part of a morphological root (e.g. *Lift* 'lift'). The results showed that the presence of a morpheme boundary led to shorter reaction times (RTs) and fewer errors, whatever the target cluster position in the word. The authors concluded that phonotactics helps in the decomposition of words into morphemes without, however, explicitly considering that it is a direct consequence of a morphological decomposition mechanism taking place during lexical access.

More recently, Beyersmann et al. (2014) examined the morpho-orthographic hypothesis according to which all complex forms are segmented into morphemes during lexical access. The prefixed and suffixed letter strings they manipulated comprised real stems and affixes but never formed an existing word in French (e.g. *propoint*, *filmure*). They used a letter search task in which adult French speakers had to decide whether the target letter was present or absent in a string of letters (e.g. 'R' in *propoint* or *filmure*). The results revealed that the letter search took longer in suffixes compared with non-suffix endings but not for prefixes compared with non-prefix beginnings. Moreover, performance was not affected by letter cluster frequency. The difference in processing suffixes relative to non-suffixes was interpreted as reflecting a chunking/affix stripping mechanism (cf. Taft & Forster's serial hypothesis 1975) that operates on functional units such as suffixes during lexical access. Furthermore, the authors interpreted the absence of effects for the prefixed non-words (i.e. no significant difference between prefixed and non-prefixed non-words) in terms of the different semantic and syntactic functions of prefixes relative to suffixes.

 <sup>1</sup> CNRS, CLLE-ERSS, University of Toulouse (UT2).

 <sup>2</sup> LNPL – Laboratoire de NeuroPsychoLinguistique, University of Toulouse (UT2).

Giraudo and Grainger (2003) also found an asymmetry in the processing of prefixed vs. suffixed words in a series of masked priming experiments conducted with French complex words. More precisely, they found morphological masked priming effects, but only for prefixed words (i.e. *enjeu* 'stake' primed *envol* 'flight', while *ennui* 'boredom', a pseudocomplex in which *en-* is not a prefix, and *lapin* 'rabbit', an unrelated word, did not induce any priming effect), suggesting that only prefix series are activated at the very early stages of visual word recognition. The authors interpreted this asymmetrical processing in terms of the different linguistic functions these two types of affixes imply, in particular the fact that prefixes usually carry more transparent semantic information than suffixes, whose function is much more related to syntax. Accordingly, morphological priming effects using affixed words would rely on the semantic relationships in prime-target pairs – prefix priming effects being clearly obtained. This result constitutes a challenge for decompositional models situating morphological effects at a sublexical level that should be insensitive to semantics. Moreover, the conclusions reached by Beyersmann et al. (2014) are not only contradictory to what Giraudo and Grainger (2003) found about affix processing in French but they are also incompatible with what Korecky-Kröll et al. (2014) found concerning the target cluster position, which did not interact with RTs in their experiment. Therefore, the conclusion about the kind of cognitive processes underlying affixed word analysis is not clear, while the interpretation of the effects obtained with a letter search task has to be re-examined.

As a consequence, and because Beyersmann et al. (2014) tested non-words instead of existing words, meaning that they did not control the morphonotactic characteristics of their material, it is worth carrying out a new experiment using words and controlled materials in order to tease apart the morphonotactic effects from the positional effects in word recognition. Working on non-words presents the advantage of creating the materials very easily, especially when formal aspects have to be controlled. However, when the morphological issue is at stake, the manipulation of non-words, even if they are morphologically complex, restricts the conclusions derived from the results. First, the morphological structure of a given word actually corresponds to both the form and meaning information, and second, this word is embedded in a network formed by its morphological family and series. Therefore, the manipulation of non-words suppresses the two morphological dimensions of complex words, their syntagmatic and paradigmatic structure (see Blevins 2014 for discussion).

Finally, it is worth highlighting that, in our view, the letter search task does not permit examination of lexical access per se but rather of morphological salience. Recently, Giraudo and Dal Maso (2016) discussed the issue of morphological processing through the notion of morphological salience – defined as the relative role of the word and its parts – and its implications for theories and models of morphological processing. The issue of the relative prominence of the whole word and its morphological components has indeed been overshadowed by the fact that psycholinguistic research has progressively focused on purely formal and surface features of words, drawing researchers' attention away from what morphology really is, systematic mappings between form and meaning. While we do not deny that formal features can play a role in word processing, an account of the general mechanisms of lexical access also needs to consider the perceptual and functional salience of lexical and morphological items. Consequently, if we acknowledge the sensitivity to the word's morphological structure, we claim that it corresponds to secondary and derivative units of description/analysis.

In the present research, a letter search task was carried out using French words; the main comparison is between the words that include the target letters after a morphonotactic boundary (e.g. *vivre* 'to live') and those with a purely phonotactic one (e.g. *centre* 'centre'). The hypothesis is that morphonotactic segmentation will be facilitated because of a double salience conveyed in the boundaries, as it is not only phonological but also morphological. Position effects will also be explored, as we compare the initial (position 1) vs. the final position (position 2) of letter targets. The results are presented with both a categorical analysis (comparing the different conditions) and a linear mixed-effects model.

#### 2. METHODOLOGY

#### 2.1. PARTICIPANTS

Thirty participants were recruited at the University of Toulouse, all native French speakers. Their ages ranged from 18 to 30 years (average 23); 7 were male and 23 were female. All participants were right-handed with normal or corrected-to-normal vision.

#### 2.2. ITEMS

Sixty French word forms (the same inflectional form) were used as targets; those used for the critical condition can be segmented morphotactically, while for the three control conditions the syllabic segmentation was purely phonotactic (phonology and morphology did not correspond). All words included a combination of the letters 'RE', half of the time at the beginning of the word and half of the time at the end. Four conditions were created in order to counterbalance the position (initial–final), the boundary type (morphological and phonological or only phonological) of the bigram to be searched, and the part of speech status (verb or noun in singular and infinitive) of the word (see Table 1 for stimuli characteristics).

15 verbs morphonotactic (**MP verb\_RE/P2**): verbs that had the letters 'RE' in a final position (after a consonant at the end of the word), and corresponded to a morphological and phonological boundary. For example, *vivre* 'to live' contains *viv-* as a morphological base (stem) and *-re* as a suffix (the infinitive marker).

15 nouns phono final (**P noun\_RE/P2**): nouns that ended with 'RE', but the phonotactic boundary did not correspond to the morphological one. For example, in *centre* 'centre', *-re* is not a suffix and *cent-* is not a stem.

15 verbs phono initial (**P verb\_RE/P1**): verbs that started with 'RE', but the phonotactic boundary did not correspond to the morphological one. For example, in *refuse* 'refuse', *re-* is not a prefix and *-fuse* is not a stem.

15 nouns phono initial (**P noun\_RE/P1**): nouns that started with 'RE', but the phonotactic boundary did not correspond to the morphological one. For example, in *religion* 'religion', *re-* is not a prefix and *-ligion* is not a stem.

In order to perform the task, 60 filler words (30 nouns and 30 verbs) matched in length and frequency with the target words, but not having 'RE' clusters in their spelling, were used. All word frequencies were taken from the *Lexique* database (New et al. 2001).


Table 1. Stimuli controlled for frequency and length

#### 2.3. PROCEDURE

Participants were seated 50 cm from the computer screen and asked to perform a 'letter search task'. They were instructed to respond as rapidly and accurately as possible to whether the cluster of letters 'RE' was present or not in the word to be displayed on the screen. Participants responded 'yes' by pressing one of two response buttons with the forefinger of their right hand and 'no' by pressing the other response button with the forefinger of the left hand.

The DMDX software (Forster & Forster 2003) was used. Each trial consisted of the following sequence of stimuli: the letters to be searched (RE) presented in uppercase (for 700 ms), followed by a fixation mark (1000 ms), a French word in lowercase (50 ms), which, in turn, was replaced by a mask (##########) that remained on the screen until the participant responded (for a maximum of 1500 ms). After 10 practice trials, participants received the 120 experimental trials in one block in a randomized order (see Figure 1 for a trial example).
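The trial structure described above can be summarized schematically as follows. This is an illustrative Python sketch, not DMDX script syntax: the class and function names are hypothetical, and the '+' used for the fixation mark is an assumption, since the actual symbol is not specified in the text.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Event:
    label: str
    content: str
    duration_ms: int  # for the mask, this is the maximum display time


def build_trial(word: str) -> List[Event]:
    """Sequence of displays for one trial, with the durations given in the text."""
    return [
        Event("target letters", "RE", 700),      # uppercase letters to be searched
        Event("fixation", "+", 1000),            # '+' is an assumed fixation symbol
        Event("word", word.lower(), 50),         # French word in lowercase
        Event("mask", "#" * 10, 1500),           # shown until response, 1500 ms at most
    ]


def build_block(targets: Sequence[str], fillers: Sequence[str],
                seed: Optional[int] = None) -> List[List[Event]]:
    """One randomized block mixing target and filler words."""
    rng = random.Random(seed)
    words = list(targets) + list(fillers)
    rng.shuffle(words)
    return [build_trial(w) for w in words]
```

With the 60 target words and 60 fillers of the present design, `build_block` returns the 120 experimental trials of the single block; the 10 practice trials would be built in the same way from separate items.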

Figure 1. Trial example for the letter search task (RE)

<sup>3</sup> Frequency (Fx) calculated in frequency per million from the Lexique corpus database (New et al. 2001). Length corresponds to the average number of letters.

#### 3. RESULTS 1

For the statistical analyses, filler words (with no RE clusters) were not considered. Accuracy across all participants was above 80%, and no significant differences in accuracy were found (see Table 2 for accuracy and reaction time means). For the reaction time (RT) analysis, trials considered 'errors' were not taken into account (6% of the data), and a trimming procedure was used, excluding responses under 300 ms (1.6% of the data) and responses 2.5 SDs above or below the mean response time of each participant (2.89% of the data). For the RTs (see Figure 2 for RT means), an ANOVA was conducted using participants (F1) as a random factor, treating the boundary type as a within-participant factor (repeated measures).
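The trimming procedure can be sketched as follows. The order of the steps (errors and fast responses removed before each participant's mean and SD are computed) and the use of the sample standard deviation are assumptions, as the text does not specify them; the function name and the dictionary keys are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def trim_rts(trials: List[dict], floor_ms: float = 300.0,
             sd_cutoff: float = 2.5) -> List[dict]:
    """Return the trials kept for the RT analysis.

    Each trial is a dict with the keys 'participant', 'rt' (in ms) and
    'correct' (True/False).
    """
    # Step 1: drop error trials and responses faster than the floor.
    kept = [t for t in trials if t["correct"] and t["rt"] >= floor_ms]

    # Step 2: collect the remaining RTs per participant.
    by_participant: Dict[str, List[float]] = {}
    for t in kept:
        by_participant.setdefault(t["participant"], []).append(t["rt"])

    # Step 3: drop responses beyond sd_cutoff SDs from the participant's mean.
    trimmed = []
    for t in kept:
        rts = by_participant[t["participant"]]
        if len(rts) < 2:
            trimmed.append(t)  # SD undefined for a single observation
            continue
        m, sd = mean(rts), stdev(rts)
        if abs(t["rt"] - m) <= sd_cutoff * sd:
            trimmed.append(t)
    return trimmed
```

Computing the cutoff per participant rather than over the pooled data keeps generally slow or fast responders from being over-trimmed.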

Table 2. Accuracy (in error rates) and reaction times (in milliseconds) means


Figure 2. Reaction time results (ms) with standard error bars (SE) of the mean for each of the different conditions

A main effect of position was found, and planned comparisons showed that the only significant difference among the four conditions was that the response times for the 'MP' condition were significantly faster (*F*(3,59) = 8.72, *p* < .01) than those obtained for the 'P' condition with RE in the first position (beginning).

Focusing on the comparison between the two conditions where the target letters were at the beginning of the word and the only difference was the boundary type, the MP condition produced a facilitation effect (over the P one). To disentangle these findings and explore whether the effects are due to the fact that these conditions differ in grammatical category (part of speech), a comparison between these variables was undertaken, but it showed no significant difference.

#### 3.1. POSITION

An ANOVA revealed a significant difference between the initial and final positions, *F*(1,59) = 9.92, *p* < .01, indicating that participants' responses were faster (by 76.33 ms) when the target letters ('RE') were in initial position, taking together conditions P(noun\_P1) and P(verb\_P1), than when they were in final position, MP(verb\_P2) and P(noun\_P2).
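For a within-participant factor with only two levels, such as initial vs. final position, the by-participant (F1) repeated-measures F statistic equals the square of the paired t statistic, with degrees of freedom (1, n − 1). The sketch below illustrates this relationship; the function name and the input format (one mean RT per participant and position) are assumptions, and it is not the authors' analysis script.

```python
from math import sqrt
from statistics import mean, stdev


def f1_two_level(initial_means, final_means):
    """Repeated-measures F for a two-level within-participant factor.

    Both arguments hold one mean RT per participant, in the same order.
    Returns (F, (df1, df2), mean difference final - initial).
    Needs at least two participants and non-identical differences.
    """
    diffs = [f - i for i, f in zip(initial_means, final_means)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))  # paired t statistic
    return t ** 2, (1, n - 1), mean(diffs)
```

In this formulation, degrees of freedom of (1, 59) as reported above correspond to n = 60 participants.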

The average RTs for the pooled conditions with 'RE' at the end or at the beginning of the word are shown in Table 3 and Figure 3.


Table 3. Mean average RT (and standard deviation, SD) of position (initial vs. final) in milliseconds

\*\* indicates a significant difference (*p* < .05)

Figure 3. Mean average RT of position (initial vs. final) in milliseconds with SE bars

Looking only at the position effect, it can be argued that 'RE' is identified faster at the beginning of the word, regardless of whether the boundary is morpho-phonological or purely phonological. This may be due to the left-to-right reading direction. However, when morphonotactic boundaries are compared with merely phonotactic ones, there is a significant effect for the morphonotactic condition. That is, if position is the only factor considered, 'RE' is always found faster at the beginning of the word; but within this position, 'RE' is found significantly faster when the boundary is morphonotactic. This is probably due to the dual information conveyed, which enhances morphological salience, aiding the analysis of the word into its constituents and facilitating identification of the target letters.

#### 3.2. PART OF SPEECH (GRAMMATICAL CATEGORY)

The average RTs for the pooled conditions containing 'RE' in verbs or nouns are shown in Table 4 and Figure 4. An ANOVA revealed no significant difference between them, *F*(1,59) = 1.93, *p* = .17: although participants' responses were numerically slower (by 16.54 ms) when the target letters ('RE') were in a verb (conditions MP\_verbRE and P\_RE\_verb) than when they were in a noun (P\_nounRe and P\_RE\_noun), this difference is not statistically significant.


Table 4. Mean average RT per part of speech (verb vs. noun) in milliseconds

Figure 4. Mean average RT per part of speech (verb vs. noun) in milliseconds with SE bars

#### 3.3. MORPHONOTACTIC VS. PHONOTACTIC

In the comparison of average RTs between the two conditions in which the target letters were at the beginning of the word (P1) and the only difference was the boundary type, a facilitation effect was found for the morphonotactic boundary type (compared to the purely phonotactic one). Figure 5 shows the average RTs and the significant difference between the two types of boundary (MP or purely phonotactic).

Figure 5. Reaction time results in milliseconds with SE bars of the mean for morphonotactic vs. phonotactic P2 conditions

#### 4. RESULTS 2

It is also of interest to examine the possible relationship between RT and a set of variables related to formal (surface) aspects of the words, such as length in characters, number of syllables and orthographic neighbourhood. The log-transformed reaction times of correct responses were therefore also analysed using a linear mixed-effects model. The fixed-effect predictors were the following:

a) condition (MP verb\_RE/P2, P noun\_RE/P2, P verb\_RE/P1, P noun\_RE/P1)

b) ortho\_neigh (number of orthographic neighbours)

c) nbsyll (number of syllables)

d) nbletters (number of letters, i.e. word length)

e) TP\_ORT (orthographic transitional probability)

f) FrWaC\_freq (log frequency of the form in the FrWaC corpus, Baroni et al. 2009)
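One way to write out a model with these predictors is given below for participant *i* and item *j*. This is a sketch under the assumption that participants and items (named as random effects in the text) enter as random intercepts only; the chapter does not state the random-effects structure in more detail, and the coefficient labels are illustrative.

```latex
\begin{equation*}
\begin{aligned}
\log(\mathrm{RT}_{ij}) ={} & \beta_0 + \beta_{\mathrm{cond}(j)}
  + \beta_1\,\mathrm{ortho\_neigh}_j + \beta_2\,\mathrm{nbsyll}_j
  + \beta_3\,\mathrm{nbletters}_j \\
 & + \beta_4\,\mathrm{TP\_ORT}_j + \beta_5\,\mathrm{FrWaC\_freq}_j
  + u_i + w_j + \varepsilon_{ij}, \\
 & u_i \sim \mathcal{N}(0, \sigma_u^2), \quad
   w_j \sim \mathcal{N}(0, \sigma_w^2), \quad
   \varepsilon_{ij} \sim \mathcal{N}(0, \sigma^2)
\end{aligned}
\end{equation*}
```

Here $\beta_0$ is the intercept for the reference level of condition, $\beta_{\mathrm{cond}(j)}$ is the offset for the condition of item *j* (zero for the reference level), and $u_i$ and $w_j$ are the participant and item intercepts.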


Table 5. Linear mixed-effects results of the reaction time data

Signif. codes: 0 '\*\*\*' 0.001 '\*\*' 0.01 '\*' 0.05 '.' 0.1 ' ' 1

Participants and items were included as random effects. The TP\_ORT variable reports the transitional probability of the bigram formed by the letter preceding the target letters (e.g. <v> in viV-re) and the first of the letters to be searched (e.g. viVRe), that is, the conditional probability that an <r> follows a <v>. Table 5 shows the results of the linear mixed-effects model and Figure 6 shows the significance of the standardized fixed effects.
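The conditional probability behind TP\_ORT can be estimated from bigram counts: P(r | v) = count(vr) / count(v followed by any letter). The sketch below counts over a plain list of word types; the function name and the toy word list are illustrative assumptions, and the chapter does not say which corpus or weighting (type-based or frequency-weighted) was used for the actual estimates.

```python
from collections import Counter


def bigram_transition_probability(words, first, second):
    """Estimate P(second | first) from letter bigrams in a list of words."""
    pair_counts = Counter()   # counts of (letter, next letter)
    first_counts = Counter()  # counts of a letter followed by any letter
    for word in words:
        word = word.lower()
        for a, b in zip(word, word[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    if first_counts[first] == 0:  # letter never occurs before another letter
        return 0.0
    return pair_counts[(first, second)] / first_counts[first]


# Toy example: <v> is followed by another letter 6 times, twice by <r>.
toy_words = ["vivre", "livre", "vive", "avis"]
p_r_given_v = bigram_transition_probability(toy_words, "v", "r")
```

A word-final letter has no successor and therefore does not enter the denominator.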

A significant effect of boundary type was found, showing that participants identified the letters faster in words that included both a morphological and a phonotactic boundary than in words with a purely phonotactic one (in P2). The intercept (morphonological condition: MP verb\_RE/P2, our 'base case') was significant, meaning that the RTs for finding the letters 'RE' in this condition seem to reflect a facilitation effect, while the phonological condition (P noun\_RE/P2) shows an inhibitory effect (longer RTs are needed to respond). In both of these conditions, the target letters are in the final position of the word (P2).

Figure 6. Standardized fixed effects showing that frequency, length and condition are significant

The marginal effects of the model predictors are shown in Figure 7.


Figure 7. Marginal effects of model predictors

#### 5. CONCLUSION

The Strong Morphonotactic Hypothesis (Dressler & Dziubalska-Kołaczyk 2006) was tested with the letter search task paradigm, using French words with morphonotactic and phonotactic boundaries and varying the position of the target letters across the materials. The target could be either at the beginning of the word (position 1) or at the end (position 2). Globally, the results showed that prototypical morphonotactic sequences were processed faster than phonotactic sequences, suggesting that phonotactics help us perceive the internal word structure in terms of morphological construction by enhancing their morphological salience. The results presented here revealed that this was the case for position 2 (but not position 1): letter search times were longer when the target letters were embedded in a phonotactic condition than in a morphonotactic one.

Our findings also provide indirect evidence for a left-to-right bias in word recognition, that is, a processing asymmetry between word beginnings and ends, and we assume that the mechanisms underlying printed word recognition are shaped by the physical constraints imposed by the reading direction (Giraudo & Grainger 2003). Responses were indeed significantly slower for the items that had the target letters in the second position (P2) than for those that had them in the first position (P1).

A significant effect of frequency was also obtained, showing that the more frequent a word is, the faster the reaction times are, while all other variables (such as orthographic neighbours, transitional probability and grammatical category) were found not to be statistically significant.

According to our view of morphological processing (Giraudo & Voga 2014; Voga & Giraudo 2017), morphology plays a central role in the cognitive system at two levels: at a perceptive/surface level (when the morphological structure is salient, as is the case for morphonotactic words), and at a central level (where paradigmatic relationships organize the word representations coded in the mental lexicon). We claim that finding sensitivity to morphology and effects of abilities is compatible with a paradigmatic/construction view of morphology (e.g. Booij 2010). On the one hand, morphological salience can speed up lexical access in adult word comprehension and help to develop the morphological awareness of those learning to read. Morphological awareness refers to children's "conscious awareness of the morphemic structure of words and their ability to reflect on and manipulate that structure" (Carlisle 1995: 194). Accordingly, it contributes to reading ability (e.g. Brittain 1970; Carlisle 1995; Deacon & Kirby 2004; Mahony, Singson & Mann 2000; Nagy, Berninger & Abbott 2006; Nunes & Bryant 2006; Kirby et al. 2012). Consequently, morphemes can provide cues for meaning, spelling and pronunciation (e.g. Carlisle 2003). On the other hand, construction representations link morphologically related words at a central level, and the presence or absence of connections is determined by the degree of semantic/functional relationship between the word forms according to their shared morpheme (base or affix).

A fundamental assumption of this view is that construction representations emerge and are stabilized in long-term memory according to an ecological rule that imposes family and series clustering as an organizational principle of the mental lexicon. To conclude, the claim is that the mental lexicon is constructed along multiple dimensions: the perceptive salience of the word's morphological structure (enhanced by morphonotactics) and its formal-semantic relationships with the other coded words, in other words, its syntagmatic and paradigmatic dimensions.

#### REFERENCES



This volume unites six contributions on the morphonotactics of consonant clusters and its difference from phonotactics (in a narrow sense). Morphonotactics comprises that part of phonotactics (in a broad sense) which is due to interaction with morphology. It deals prototypically with clusters that arise from morphological concatenation, as in the word-final consonant cluster in Ger. (er/sie) mach-t '(he/she) make-s', which is morphonotactic, vs. its homophonous phonotactic equivalent Macht 'power'. The opening chapter introduces the area of morphonotactics and the following five chapters, which deal with German or French morphonotactics or both. The first of these presents the first corpus linguistic analysis of German morphonotactics based on a large electronic corpus, the second investigates phonetic processing in both languages, and the remaining three, equally distributed between the two languages, deal with the impact of (mor)phonotactics on processing and language acquisition. Thus, this volume unites phonological, morphological, phonetic, psycholinguistic, corpus linguistic and typological perspectives integrated into a series of experimental approaches. The volume presents selected results of a bilateral research project funded by the Agence Nationale de la Recherche (ANR) and the Austrian Science Fund (FWF).

Wolfgang U. DRESSLER is head of the Working Group "Comparative Psycholinguistics" at the Department of Linguistics at the University of Vienna.

Basilio CALDERONE is a CNRS (French National Centre for Scientific Research) Research Engineer at the University of Toulouse Jean-Jaurès.

Sabine SOMMER-LOLEI is a PhD student at the University of Vienna and recipient of a DOC-team fellowship of the Austrian Academy of Sciences.

Katharina KORECKY-KRÖLL is a postdoctoral researcher at the Department of German Studies of the University of Vienna.
